I've encountered a problem with Proc Export. I've got a large dataset in SAS with thousands of columns. I want to export it so I can use it in another program, but when I inspect the file, no matter the type I've exported to (XLS, CSV, TXT), the column names just disappear at around column 3500. Then, the data slowly starts to disappear. Rows at a time go missing, more and more, until just one row of data is left. This is definitely different from the data as it appears in SAS.
As noted above, I tried several different file types, but there was no change. I also tried opening the tab-delimited file in a text editor, thinking maybe it was an Excel problem, and again, the column names just disappear at the same place. I receive no error from SAS during the process of the export - it acts like everything went fine.
Any ideas? I'd really like to get this data out, and especially in this exact configuration (ie no converting to long instead of wide).
My SAS version is 9.2.
asked 17 Jul '11, 00:18
It looks like you're running into the limit set on the LRECL option.
The following code will give you about 5.7K columns into a csv:
The following notes are issued to the log (after around 40K other lines!!):
If we use an explicit filename, we can set the logical record length to accomodate the line lengths:
This time, no notes are written to the log and all columns appear in the output.
It look to me like proc export is setting LRECL to MAX (aka 32K). This is a touch misleading, the maximum record length you can write is 1 GB for SAS9.2 (1MB for SAS9.1).
Hope this helps! - Will Dobson (who hasn't figured out how to sign into the site with is old account since it changed! ¬_¬)
answered 01 Aug '11, 06:41
Thanks Will. (And sorry it's taken me so long to comment back. It seems it's been so long that RunSubmit doesn't even want me to sign in anymore, so I had to start new.) Weirdly, even though SAS seems to acknowledge the filename statement because I get the filename I specify in it, the LRECL statement appears to be skipped. I say this because after running your code above with the explicit filename, I get the same strangely dwindling CSV after exporting is done, and I get these notes in the log:
You can see that it's plain ignoring my LRECL option and telling me it's going to go with 32K instead. Any ideas as to why?
Edit: Will, I just tried with your created dataset and your export code, and still did not get a complete dataset. I got the same Notes in the Log about LRECL=32767. It seems obvious now that something is going on with my SAS, even though it is version 9.2. It really does not seem to be acknowledging the LRECL option, but I have no idea why.
Edit2: This seems to be an issue with 9.2. Will, are you using 9.1? I found mention here (http://www.mathkb.com/Uwe/Forum.aspx/sas/43808/SASware-Ballot-Change-LRECL) of this fix working with V9.1, but not with V9.2. How frustrating! I will open a ticket with SAS.
I received an answer from SAS, and it is ridiculous, but I think it worked.
Basically, the 32k limit for LRECL exists with PROC EXPORT in V9.2. I have no idea why. SAS told me that in order to get around this, I need to run the Data Step.
"The Data Step" is the massive, huge step that is spit out in the Log when you run Proc Export. Basically, the step that you are trying to avoid by running Proc Export in the first place. The step you are ESPECIALLY trying to avoid if you happen to have a dataset that needs an LRECL longer than 32k. Nonetheless...
You take this Data step from the Log. Of course, there are line numbers throughout it. Maybe you can turn these off, I'm not sure, but instead of bothering to find out, I saved the log to a text file using proc printto, then opened the text file in Excel, then adjusted the settings upon import so that all of the Line numbers were in Column A of the Excel sheet, with the rest of the stuff (the data step) in Column B. Then I deleted the line numbers.
Next, I copy/pasted that (apparently I'm stupid with Excel because whenever I tried to just save it as a text, it would put quotes around each line, whereas copy/paste the Excel column into Notepad gave me just the text of the datastep) into an Editor window in SAS. Now, the Data step calls the 'filename' statement that Will used above, so that must be maintained. So, the code will look like this (I deleted the huge, long, 10000s of lines of code for placing variables):
So, I ran that. First, SAS bombed out because I was out of memory. Rebooted, refrained from unnecessary memory gobbling programs upon startup, and ran it again.
I definitely got a much, much bigger file out of it, so big that I can't check it in Excel to make sure it's all there. I opened it in Notepad and it certainly looks all there, or at least more "all there" than it did before. A summary() and a fix() in R seems to indicate that the file is indeed all there.
So, should you have the misfortune of needing to get a HUGE file out of SAS V9.2, that's how you do it.
answered 01 Sep '11, 16:37
How about using ODS instead of proc export (if that's an option).
Here's an example using ODS csvall:
This gives the full length of output (run against the example in my post above), and (in SAS9.1 at least) runs in about 1 second. For comparison, both the export steps above take between 5 to 7 minutes to run on my machine!
A few other options are detailed in this rather useful post: http://studysas.blogspot.com/2009/02/how-to-create-comma-separated-file-csv.html
Timings from both the dataset examples in the above post are also around 1 the second mark, so ODS export doesn't look too promising!
There are differences in the output of all these methods such as:
So you'd need to tweak options accordingly to ensure that the output is as required.
Hope this helps a bit more than the nightmare code from above!
answered 20 Sep '11, 10:21