read_stata issue #11526
Comments
Try with encoding='utf-8' and see if that works.
Same error.
This seems peculiar to your case and out of scope for pandas.
I have the same problem with DTA files generated by SAS. Stata users have no problem opening these files, and neither do I if I use R, but I can't open them using pandas (v0.18.1).
@torstees Can you provide a reproducible example (e.g., an example file that fails to open)?
Here you go. I included a minimal Python script and a minimal R script, along with the output I'm seeing on current OS X and on a recent Linux system. The dataset was exported with the current version of SAS using proc export. It contains a simple list of ids and nothing else.
It is very likely that the Stata files I use are generated by SAS, since it is the main software used by the French statistical institute. I tried @torstees's data and got the same error.
The dta file format code for the file supplied by @torstees is 111. According to the R docs (https://stat.ethz.ch/R-manual/R-devel/library/foreign/html/read.dta.html), format 111 corresponds to Stata 7SE. However, the SAS docs state that SAS writes dta files compatible with Stata 8 and later: http://support.sas.com/documentation/cdl/en/acpcref/63184/HTML/default/viewer.htm#a003103776.htm There is no mention of format version 111 in the Stata dta format docs:
I also noticed that the References section of https://stat.ethz.ch/R-manual/R-devel/library/foreign/html/read.dta.html states that the spec for dta files written by Stata 7 is contained in the printed programming manual. I can't find it online.
The small SAS program below exports a Stata dta file. You can then use Stata to check its version, with
Here is the SAS program (you need to have a small "tmp.csv" file in the working directory):
I get the following error when reading a bunch of Stata files:
I do not know how to check the Stata version of the file, but I suspect it is fairly recent.
Opening the file with Stata 12 and saving it again solves the problem.
But I do not have Stata installed on my machine, so I cannot do that every time.
How can I check the version of the file?
Would anybody be kind enough to have a look at the problematic file (by personal email, please)?
I suspect an encoding problem. The data is a sample from a French survey and contains many strings with accented characters.
Thanks for the help.
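On the question of checking the file version without Stata: a minimal sketch in Python, assuming the usual dta layout in which older files (format 115 and earlier) store the format code in their first byte, while newer files (117 and later) begin with an XML-like header containing a `<release>` tag. The function names and the lookup table are illustrative and not part of pandas; the entry for 111 comes from the R docs cited in this thread, and the other entries are the commonly documented mappings.

```python
import re

# Illustrative lookup table (an assumption, not an official list).
# 111 -> Stata 7SE is taken from the R read.dta documentation.
KNOWN_FORMATS = {
    111: "Stata 7SE",
    113: "Stata 8",
    114: "Stata 10",
    115: "Stata 12",
    117: "Stata 13",
    118: "Stata 14",
}


def dta_format_code(header: bytes) -> int:
    """Return the dta format code found in the leading bytes of a file."""
    if not header:
        raise ValueError("empty header")
    # Newer files start with <stata_dta><header><release>NNN</release>...
    if header.startswith(b"<stata_dta>"):
        match = re.search(rb"<release>(\d+)</release>", header)
        if match is None:
            raise ValueError("no <release> tag found in header")
        return int(match.group(1))
    # Older files store the format code as a single leading byte.
    return header[0]


def read_dta_format_code(path: str) -> int:
    """Hypothetical helper: read just enough of the file to find the code."""
    with open(path, "rb") as f:
        return dta_format_code(f.read(64))
```

Calling `read_dta_format_code("myfile.dta")` (the path is a placeholder) and looking the result up in `KNOWN_FORMATS` gives a quick indication of which Stata release the writer was targeting. A code missing from the table means only that this sketch does not list it.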