QUOTE
I'm not sure I understand the statement "to print the byte values one by one; or just use a utility like od to look at the source text." Can you be more specific in terms of what this will do and the reason for it?
Yes, the basic point is that any text handled in software is *actually* a sequence of bytes. It is *not* a sequence of squiggles drawn on a screen. In the good old (simple) days, the bytes were ASCII codes for enough characters to write (semi-literate) English and programs in C or the like; now the bytes are typically an encoding of Unicode characters, of which there are far too many to remember, and which include all sorts of invisible control codes that probably don't mean anything to your program. So you need to be able to look at the actual byte values.
If you have a source text file (not "Excel" or whatever), you can look at it with a dump utility such as od (Google "od utility"; see e.g.
http://www.linuxjournal.com/article/1326 )
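To give an idea of what such a dump looks like, here is a minimal sketch. The sample data is made up for illustration (the function name sample is hypothetical, and your file will differ): the text "abc" preceded by a UTF-8 byte order mark, which is one common kind of invisible crud. In practice you would run od directly on your own file instead.

```shell
# Hypothetical sample: "abc" preceded by an invisible UTF-8 byte order mark.
# printf's octal escapes write the raw bytes EF BB BF explicitly.
sample() { printf '\357\273\277abc\n'; }

# -An drops the offset column; -tx1 prints each byte as two hex digits.
# The dump starts with ef bb bf (the byte order mark), then 61 62 63 0a.
# Exact spacing varies between od implementations.
sample | od -An -tx1

# -c shows printable ASCII as characters and other bytes as escapes or octal.
sample | od -An -c

# Byte count is 7, although only "abc" plus a newline is visible on screen.
sample | wc -c
```

The extra three bytes at the front are exactly the sort of thing you cannot see in an editor but that will break a string comparison or a database import.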
Or you can write a little bit of debugging code in your PHP program, calling ord() on each byte of the string to get its numeric value:
http://jp2.php.net/manual/en/function.ord.php
If you get more numbers than there are visible ASCII characters, you know there is some extra crud in there, which you can remove with a bit of code in the importing program.
If you have a source text file with the data, you could put it here as an attachment (perhaps zip it first, so it can't be "corrected" by some "helpful" software along the line).