Assignment #1 Frequently Asked Questions (FAQ)

Q1. What is the purpose of this page?

Q2. The first time I create a data connection, everything works fine. The second time, I get an error about the address being in use. What's going on?

Q3. I'm having trouble while read()ing from a socket. My program just hangs.

Q4. The IP address field is still zero after the bind call. Can I send this address in the PORT command as it is?

Q5. How do I create a file with the proper permissions?

Q6. How do I create a new directory in C?

Q7. Can I use bzero() or bcopy() even though they technically aren't ANSI C?

Q8. How do I handle symbolic links?

Q9. Does the total number of bytes reported include those given by the NLST command?

Q10. Some sites give errors for the STRU and MODE commands. For those sites, can we assume that the data structure and the transfer mode are by default a file structure and a stream transfer mode?

Q11: Can I use LIST *.gz to get the list of files with the .gz extension?

Q12: What happens if my machine has two IP interfaces (or ethernet cards)? Which one do I use?

Q13: With some servers, like ftp.fedworld.gov or klamath.stanford.edu, NLST does not show any directories, only files. Why can't I use NLST?

Q14: What happens if I encounter a non-RFC compliant server?

Q15: I am having problems reading the filenames provided by ftpparse. What am I doing wrong?

Q16: Some servers I try don't seem to end lines in their LIST output with "\r\n", using only "\n" instead. This breaks the sample code from the assignment description! What am I doing wrong?

Q17: What files does the extension "gz" match? *.gz, *.tar.gz, *.tgz?

Q18: How do I determine my local IP address?

Q19: Can I assume that server responses over the control connection are no longer than 1024 bytes?

Q20: How important is performance? Should I thread my code?

Q21: I'm sending only NVT ASCII and binary data over the control and data connections. Do I still need to use the htonX()/ntohX() functions anywhere?

Q22: So, what is a symbolic link anyway?


Q1. What is the purpose of this page?

A. This page is a list of the most frequently asked questions (hence the name FAQ) for the current homework assignment (HW#1). Before sending e-mail to your TA with any question regarding specification or implementation details of the assignment, you should check this FAQ page to see if the question has already been answered. If the question is not on the page, or if the explanation on the page is unclear, feel free to e-mail your TA with the question.

Q2. The first time I create a data connection, everything works fine. The second time, I get an error about the address being in use. What's going on?

A. You are running into a behavior of TCP called the 2MSL timeout. (You will hear more about this in class.) Basically, TCP prevents the same IP address and port from being reused for a certain amount of time after a connection is closed. This keeps very late packets that arrive after the connection is closed from being associated with a new connection that has the same address.

An easy solution is to set sin_port to zero before calling bind() for the new data connection, so that the bind() call finds a free port for you.
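A minimal sketch of the port-zero approach, assuming an IPv4 socket. The helper name bind_any_port is hypothetical. Since bind() does not report which port it picked, the sketch uses getsockname() to find out.

```c
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: bind sockfd to any free port and report which one
 * the system chose. Returns the port in host byte order, or -1 on error. */
int bind_any_port(int sockfd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);            /* zero: let the system pick a port */

    if (bind(sockfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    /* Ask the socket which port it actually got. */
    if (getsockname(sockfd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

The returned port is the value you would later advertise to the server.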

Alternatively, you can use the same technique ftp uses to get around this problem. Pick a range of port addresses (ftp uses 40000 through 40099 by default). Try to create the connection with each one in turn until one succeeds. If you've already made a data connection earlier, start the scan on the port address following the one you used last time, wrapping around at the end. Be sure to handle the unlikely case where none of the ports work!
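The scan described above could be sketched as follows. The range is the one quoted in the answer; the helper name bind_in_range and the use of a static variable to remember the last port are assumptions made for illustration.

```c
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

#define PORT_LO 40000   /* range quoted in the answer above */
#define PORT_HI 40099

/* Hypothetical helper: bind sockfd to the first free port in the range,
 * starting just after the port used last time and wrapping around.
 * Returns the port in host byte order, or -1 if every port fails. */
int bind_in_range(int sockfd)
{
    static int last_port = PORT_HI;   /* so the very first scan starts at PORT_LO */
    int count = PORT_HI - PORT_LO + 1;
    int i;

    for (i = 1; i <= count; i++) {
        int port = PORT_LO + (last_port - PORT_LO + i) % count;
        struct sockaddr_in addr;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons((unsigned short)port);

        if (bind(sockfd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            last_port = port;         /* the next scan starts after this one */
            return port;
        }
    }
    return -1;   /* the unlikely case: no port in the range worked */
}
```

A caller should check for -1 and report the failure rather than continue with an unbound socket.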

Q3. I'm having trouble while read()ing from a socket. My program just hangs.

A. A read() on a blocking socket blocks when there is no data left to read. You should read the FTP RFC for a method to deal with this situation. The shutdown() method used in the example is not the right way to handle this problem: you cannot shut down a socket and then reopen it. Possible solutions are (1) to use non-blocking sockets (which we have not discussed), or (2) to use blocking sockets but send one command at a time, and read the reply character by character while parsing it, so that you know when the reply is complete. If you are using a blocking socket and, after reading the full reply, you try to read again, your program will hang indefinitely, since the server will not send you more characters until you send it a new command.
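One way to decide when a reply is complete is sketched below, based on the reply format in RFC 959: a single-line reply starts with a three-digit code followed by a space, and a multi-line reply starts with the code followed by "-" and ends with a line that starts with the same code followed by a space. The function name reply_complete is hypothetical; you would call it on the bytes accumulated so far after each read.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf[0..len) contains a complete FTP
 * reply, 0 if more bytes must be read first. */
int reply_complete(const char *buf, size_t len)
{
    size_t start = 0;   /* index where the current line begins */
    size_t i;
    int first = 1;      /* still looking at the first line? */

    for (i = 0; i < len; i++) {
        if (buf[i] != '\n')
            continue;

        /* buf[start..i] is one whole line, ending in '\n'. */
        if (i - start >= 4 &&
            isdigit((unsigned char)buf[start]) &&
            isdigit((unsigned char)buf[start + 1]) &&
            isdigit((unsigned char)buf[start + 2]) &&
            buf[start + 3] == ' ' &&
            (first || memcmp(buf, buf + start, 3) == 0))
            return 1;   /* "ddd " line: single-line reply or final line */

        first = 0;
        start = i + 1;
    }
    return 0;           /* no terminating line seen yet */
}
```

Once this returns 1, stop reading and send the next command; reading again would block as described above.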

Q4. The IP address field is still zero after the bind call. Can I send this address in the PORT command as it is?

A. No. The bind() call only selects a port for you (if you asked it to); it does not choose an IP address. You must fill in the correct IP address yourself for the PORT command.

Q5. How do I create a file with the proper permissions?

A. Try: open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IREAD | S_IWRITE)
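As an illustration, the PORT argument defined in RFC 959 is six decimal byte values: the four bytes of the IP address followed by the two bytes of the port, high byte first. The sketch below assumes you have already stored the real local address and port in a sockaddr_in in network byte order; the helper name format_port_cmd is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: write "PORT h1,h2,h3,h4,p1,p2\r\n" into out.
 * addr must hold the real local IP and port in network byte order.
 * Returns the value of snprintf(). */
int format_port_cmd(char *out, size_t outlen, const struct sockaddr_in *addr)
{
    /* In network byte order the bytes are already in the order PORT wants. */
    const unsigned char *h = (const unsigned char *)&addr->sin_addr.s_addr;
    const unsigned char *p = (const unsigned char *)&addr->sin_port;

    return snprintf(out, outlen, "PORT %u,%u,%u,%u,%u,%u\r\n",
                    (unsigned)h[0], (unsigned)h[1],
                    (unsigned)h[2], (unsigned)h[3],
                    (unsigned)p[0], (unsigned)p[1]);
}
```

For address 192.168.1.5 and port 40001 this produces "PORT 192,168,1,5,156,65", since 156 * 256 + 65 = 40001.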
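A slightly fuller sketch of the same call, with error checking. It uses the POSIX names S_IRUSR and S_IWUSR, which on common systems have the same values as the older S_IREAD and S_IWRITE; the helper name create_output_file is hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) path, readable and writable
 * by the owner only. Returns the file descriptor, or -1 on error. */
int create_output_file(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);

    if (fd < 0)
        perror(path);   /* report why the file could not be created */
    return fd;
}
```

Remember to close() the descriptor once the transfer has finished.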

Q6. How do I create a new directory in C?

A. Try: mkdir(path, S_IREAD | S_IWRITE | S_IEXEC). For more information, use man 2 mkdir on xenon or man -s 2 mkdir on the Sweet Hall machines.

Q7. Can I use bzero() or bcopy() even though they technically aren't ANSI C?

A. Yes.
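A small sketch with error handling, assuming that an already existing directory should not count as a failure (that policy is an assumption, not something the assignment states). S_IRWXU is the POSIX shorthand for owner read, write, and execute; the helper name make_dir is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a directory with rwx permission for the
 * owner. Returns 0 on success or if the directory already exists,
 * -1 otherwise. */
int make_dir(const char *path)
{
    struct stat st;

    if (mkdir(path, S_IRWXU) == 0)
        return 0;
    /* Tolerate "already exists", but only if it really is a directory. */
    if (errno == EEXIST && stat(path, &st) == 0 && S_ISDIR(st.st_mode))
        return 0;
    perror(path);
    return -1;
}
```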

Q8. How do I handle symbolic links?

A. You don't. If ftpparse() indicates that a directory entry is a symbolic link, you can just ignore it during processing.

Q9. Does the total number of bytes reported include those given by the NLST command?

A. No, only those retrieved with the RETR command.

Q10. Some sites give errors for the STRU and MODE commands. For those sites, can we assume that the data structure and the transfer mode are by default a file structure and a stream transfer mode?

A. If you use the STRU and MODE commands, please bear in mind that some ftp sites do not implement them at all and will return an error message. For those sites, you can safely assume that the default values for the file structure and the transfer mode are the ones that you need. See the cases D2-D4-D7 and D5-D8 and point 3.b (right after the example) in the assignment specification.

Q11: Can I use LIST *.gz to get the list of files with the .gz extension?

A. LIST with a wildcard pattern is not a standard feature, so it will not work with all ftp sites. You should not use it.

Q12: What happens if my machine has two IP interfaces (or ethernet cards)? Which one do I use?

A. You can use bind() to select one of them. In any case, you can assume that the machine that we are going to be testing your program with has a single interface.

Q13: With some servers, like ftp.fedworld.gov or klamath.stanford.edu, NLST does not show any directories, only files. Why can't I use NLST?

A. One of the most common ftp servers on the Internet, WU-FTPD, starting with version 2.6, changed the behaviour of NLST (see http://www.landfield.com/wu-ftpd/mail-archive/wuftpd-questions/2000/Jul/0035.html or http://www.landfield.com/wu-ftpd/mail-archive/wuftpd-questions/2000/Dec/0135.html ). The current version of the server does not return any directories with NLST. In fact, it only returns the files that can be retrieved successfully via a RETR. This new "feature" (where directories are not included) breaks any attempt to do a recursive "mget -r *" using NLST, as we are trying to do with the assignment.

It is unfortunate that the development team of WU-FTPD decided to add this new feature, especially since LIST does not return a standardized response. Fortunately, as described on the assignment page, it is possible to parse most common LIST formats. For the purposes of this assignment, you are required to use LIST; you may not use NLST. However, you do not have to worry about LIST formats that are not covered by the sample library we have suggested.

Q14: What happens if I encounter a non-RFC compliant server?

A. We only require you to have your ftpcopy client work with RFC-compliant servers. If you see a server that does not comply with the FTP RFC, please let your TA know, so that we can tell your fellow students.

Q15: I am having problems reading the filenames provided by ftpparse. What am I doing wrong?

A. Most probably you are getting a filename that is not terminated with a '\0', which breaks the standard C string functions. One simple solution is to copy the filename into a new buffer that is one character longer and then write the '\0' into the extra character.
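The copy could look like the sketch below. It assumes you have a pointer to the first character of the name and its length in bytes, as a parser that returns unterminated names would have to provide; the helper name copy_name is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated, '\0'-terminated copy of
 * the namelen bytes starting at name, or NULL if malloc() fails.
 * The caller must free() the result. */
char *copy_name(const char *name, size_t namelen)
{
    char *s = malloc(namelen + 1);   /* one extra byte for the terminator */

    if (s == NULL)
        return NULL;
    memcpy(s, name, namelen);
    s[namelen] = '\0';
    return s;
}
```

The copy can then be passed safely to strcmp(), strlen(), printf("%s"), and so on.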

Q16: Some servers I try don't seem to end lines in their LIST output with "\r\n", using only "\n" instead. This breaks the sample code from the assignment description! What am I doing wrong?

A. Be sure that you correctly set the transfer type (ASCII vs. BINARY, using the TYPE command) before issuing the LIST command. See RFC 959 for more details.

Q17: What files does the extension "gz" match? *.gz, *.tar.gz, *.tgz?

A. It matches any file whose name ends with ".gz". For example, file1.gz and file2.tar.gz match because of the trailing ".gz", but file3.tgz does not.
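The rule can be checked with a simple suffix comparison. In this sketch the function name has_extension is hypothetical, and ext is passed without the leading dot (for example "gz").

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if filename ends with "." followed by
 * ext, 0 otherwise. */
int has_extension(const char *filename, const char *ext)
{
    size_t flen = strlen(filename);
    size_t elen = strlen(ext);

    if (flen < elen + 1)                       /* too short for ".ext" */
        return 0;
    if (filename[flen - elen - 1] != '.')      /* the dot must precede ext */
        return 0;
    return strcmp(filename + flen - elen, ext) == 0;
}
```

With ext set to "gz", file1.gz and file2.tar.gz are accepted and file3.tgz is rejected, matching the answer above.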

Q18: How do I determine my local IP address?

A. There are a few ways of doing this. Page 250 of Stevens contains one example based on gethostbyname() of the local system.
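One possible sketch of a gethostbyname()-based lookup follows. It is not necessarily the code from the Stevens book, and the helper name local_ip is hypothetical. Be aware that on some configurations the host name resolves to a loopback address, which would not be useful in a PORT command, so check the result.

```c
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/* Hypothetical helper: look up this host's own name and store its first
 * IPv4 address (network byte order) in *out.
 * Returns 0 on success, -1 on failure. */
int local_ip(struct in_addr *out)
{
    char name[256];
    struct hostent *he;

    if (gethostname(name, sizeof(name)) < 0)
        return -1;
    name[sizeof(name) - 1] = '\0';   /* may be unterminated if truncated */

    he = gethostbyname(name);
    if (he == NULL || he->h_addrtype != AF_INET || he->h_addr_list[0] == NULL)
        return -1;

    memcpy(out, he->h_addr_list[0], sizeof(*out));
    return 0;
}
```

On success, inet_ntoa(*out) gives the dotted-decimal form for printing.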

Q19: Can I assume that NVT ASCII strings are no longer than 1024 bytes? Can I assume that server responses over the control connection are no longer than 1024 bytes?

A. You may assume that NVT strings read over the control connection are no longer than this length (i.e. a single line of a response), and you do not need to handle servers that do not meet this assumption. However, the total length of a server response (e.g. a greeting) may exceed this length. Note that strings sent over the data connection should not be limited, since a LIST string can be of arbitrary length.

Q20: How important is performance? Should I thread my code?

A. In terms of performance, just use common sense in your design decisions. We do not expect you to thread your code--and in fact the grading scripts do not support this functionality in student programs!

Q21: I'm sending only NVT ASCII and binary data over the control and data connections; my program doesn't transfer integer values at any point. Do I still need to use the htonX()/ntohX() functions anywhere?

A. Yes. While it's true that you don't (and shouldn't) use these functions for the non-numeric data you transfer in your program, an idiosyncrasy of the sockets API is that integer values communicated to remote machines via the socket layer must be specified in network byte order. This includes the port specified to connect(). (This will make more sense after we cover TCP later on in the course.) Refer to the Stevens book for more details.
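For example, the address structure passed to connect() might be filled in as sketched below. The helper name fill_server_addr is hypothetical; htons() converts the port to network byte order, and inet_pton() stores the IP address already in network byte order.

```c
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/* Hypothetical helper: prepare addr for connect() from a dotted-decimal
 * IPv4 address and a port given in host byte order.
 * Returns 0 on success, -1 if dotted_ip is not a valid address. */
int fill_server_addr(struct sockaddr_in *addr, const char *dotted_ip,
                     unsigned short port)
{
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);   /* host to network byte order */

    if (inet_pton(AF_INET, dotted_ip, &addr->sin_addr) != 1)
        return -1;
    return 0;
}
```

A call such as fill_server_addr(&addr, "127.0.0.1", 21) prepares the structure for the FTP control port. Use ntohs() when you need to read a port back out of a sockaddr_in.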

Q22: So, what is a symbolic link anyway?

A. A symbolic link is essentially a pointer to another file. You can read more about them at http://en.wikipedia.org/wiki/Symbolic_link



Last modified: Mon Jan 23 01:11:46 PST 2006