Monday, December 29, 2014

Problem parsing CSV Text to a List


I am having trouble converting a set of comma-separated values into a list.
In my file M.txt, I have the following content. Each line is terminated by CRLF, i.e. \n

M01:By Theme\n
M02:By Author\n
M03:By Title\n
M04:By First Line\n

When I read the file, I replace each '\n' with ',', which works fine.

But when I feed the whole thing (each item now terminated by a comma) to the block, parsing it to a list fails at runtime, saying it can't parse the text to a list. No further helpful info is given.

Attached is my block.

--
The error I am getting is:
Cannot parse text argument to "list from csv row" as CSV-formatted row

--
Why don't you try the list from csv table block, and forget about the replacements?  That should take the csv text file and convert it directly to a list.

--
The problem is that if I use the csv table block, the result will have multiple dimensions. But I am looking to get a simple list of values.
I have not been able to find a good example of this, and the documentation is too sparse.

--
Well, I finally found the problem.
Because I had CRLF instead of LF as the newline character, it choked on it.
Once I changed to LF, the list from csv row block now works.
It appears that the documentation is very poor.

--
This looks like a bug to me; see the Do It result and the example project attached.
I would have expected to get a1,a2,b1,b2,c1,c2 as the result...

But why don't you want to work with a list of lists?
If you prefer to work with a simple list, just remove the newlines in the text file itself...

--
Thanks for reporting this.  And Taifun, thanks for looking into it.  I'll file this as something for the dev team to work on.

--
I took a look at this, and I don't know what to do about it, if anything.

The problem is that the string being passed to list-from-CSV-row has unescaped CR characters in it, just as Bhagavati says. Is the confusion that a "\n" shows in the value balloon as a return, or is there some other problem?

--
I don't believe that is an issue.
The problem is how the text file is created.
Windows adds CRLF to the end of each line.
Linux adds only LF.
Replacing LF with "," as shown in the thread WILL NOT get rid of the CR.
If your file was created on Linux or Mac, you would not experience this issue. It happens only on Windows.
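
To illustrate the point, here is a minimal Python sketch (not App Inventor code). The sample string is an assumption that mirrors M.txt as it would be saved on Windows. Replacing only LF with a comma leaves every CR behind, while replacing the CR LF pair removes both characters.

```python
# Assumed contents of M.txt as saved on Windows: each line ends with CR LF ("\r\n").
windows_text = "M01:By Theme\r\nM02:By Author\r\nM03:By Title\r\nM04:By First Line\r\n"

# Replacing only LF with "," leaves the CR characters in the text.
lf_only = windows_text.replace("\n", ",")

# Replacing the CR LF pair first (then any remaining LF) removes both characters.
both = windows_text.replace("\r\n", ",").replace("\n", ",")

# repr() makes the otherwise invisible "\r" characters visible.
print(repr(lf_only))
print(repr(both))
```

The stray CR characters left in the first result are the kind of input that the "list from csv row" block rejected in this thread.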

Take the attached foo.txt, DO NOT modify it, upload it to your test .aia, and perform the same action. It should work with no problem.

For new text files, open the file in Notepad++. On the Notepad++ toolbar, turn on the paragraph marker (see the attached image) and do the replacement exactly as shown in the image.



--
Hossein is right to remind people about the CRLF versus LF issue in Windows vs. Mac and Linux.  That's a problem that's plagued people for decades.

The other issue is that App Inventor treats \n in strings specially, so that the \n gets interpreted as a single newline character (as in Java).    For example, 

But \n is the only character that is treated specially this way, unlike Java, which has other "escaped characters".

--
IS the confusion that a "\n" shows in the value balloon as a return, or is there some other problem?
I don't understand this, which value balloon are you talking about?

From a user's point of view, it doesn't matter whether a file was created on Windows or Linux/Mac; the user only wants to get rid of the invisible newline character (be it CRLF or LF only).
So this currently works only for Linux/Mac files (LF), but not for Windows files (CRLF)...
If I use the replace all block to replace \n, I would expect the newline character(s) to be removed (be it CRLF or LF only).

According to this report, Windows has a market share of ~90%...

--
Taifun has a good point. Do you think we should change the parser in Yail.js so that \n\r will generate the same thing as \n?

--

I agree with Taifun that App Inventor's \n should match Windows' CRLF (which is \r\n, not \n\r, BTW) whenever possible.

The problem is that we can't always know when to do The Right Thing. 

When reading files/strings created in Windows, we can treat every \r\n as if it were \n, and this will make a lot of things work (e.g. using the replace block with \n) that don't currently work. 
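
A minimal Python sketch of that read-side idea, for illustration only (this is not the App Inventor implementation). The helper name `normalize_newlines` is hypothetical, and converting a lone CR as well is an extra assumption beyond what is proposed above.

```python
def normalize_newlines(text: str) -> str:
    """Treat CR LF as a single LF; a lone CR is also converted (an assumption)."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


# The same content saved with Windows and with Unix line endings.
windows_text = "a1,a2\r\nb1,b2\r\n"
unix_text = "a1,a2\nb1,b2\n"

# After normalizing, both inputs are identical, so a later
# "replace \n" step behaves the same for either file.
assert normalize_newlines(windows_text) == normalize_newlines(unix_text)
print(repr(normalize_newlines(windows_text).replace("\n", ",")))
```

Replacing "\r\n" before the lone "\r" matters: doing it the other way around would turn each CR LF into two newlines.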

But when writing \n in a file/string, we have no idea whether that file/string will end up being used in Windows. So the best we can do is simply write the \n, but this won't show up correctly when viewed in Windows. 

--
Regarding "But when writing \n in a file/string, we have no idea whether that file/string will end up being used in Windows": then why not let the Windows user/developer format it correctly on receipt from an Android app?

--
Postel's Law would advise accepting both LF and CRLF but emitting just LF.
Since Android is a member of the unix family, the common convention on that platform would be to emit LF.

Back when I was scripting file transfers among mainframes, Windows, and unix,
the FTP and FTPD programs handily did the required text stream conversions based on their knowledge
of their parent operating systems.

I agree with SteveJG on this.
Keep it simple.

--
Following Postel's Law sounds like a good idea...

--
This has been added to the list of issues to assign: Issue #208 on GitHub.


--

Thank you.  This is a larger issue than Mac users realize.   :)


--

If you want a simple, one-dimensional list, just format your text file like this:

M01,By Theme,M02,By Author,M03,By Title,M04,By First Line

I would not expect "list from CSV row" to return a simple one-dimensional list if there are any newlines or carriage returns in the text outside of comma/quote-delimited fields (which can contain newlines).

If, on the other hand, you want a structured association between M01 and "By Theme", you need "list from csv table" as Enis points out below. Of course, this will create the multiple dimensional array, as you say. 
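
The difference between the two shapes can be sketched in Python with the standard `csv` module. This is only an analogy for the "list from csv row" and "list from csv table" blocks, not their implementation, and the comma-separated table text is an assumed variant of the original file.

```python
import csv
import io

# A single CSV row, as in the one-line file suggested above.
row_text = "M01,By Theme,M02,By Author,M03,By Title,M04,By First Line"

# Assumed multi-line variant: one code/description pair per line.
table_text = "M01,By Theme\nM02,By Author\nM03,By Title\nM04,By First Line\n"

# One row of CSV text gives a flat, one-dimensional list.
flat = next(csv.reader(io.StringIO(row_text)))

# Multi-line CSV text gives a list of rows (a list of lists).
table = list(csv.reader(io.StringIO(table_text)))

# Flattening the table recovers the same one-dimensional list.
flattened = [field for row in table for field in row]

print(flat)
print(table)
```

So a table result can always be flattened afterwards if a simple list is what you need, while the pairing of M01 with "By Theme" is only preserved in the list-of-lists form.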

I've also tested this with both a Windows-formatted and a Unix-formatted input file, and both work. See attached newlines.aia.

Perhaps there is a bigger issue with saving of files on Windows, but this does not seem to apply to your case.

newlines.aia
