error in URL replace - possibly due to multiple #14

Open
joehobson opened this issue Nov 3, 2015 · 1 comment

@joehobson
Member

The plugin seems to have problems processing this page: http://www2.ed.gov/programs/skillssuccess/awards.html, which is mirrored on our test server here: http://oii.wp-test.navnorth.com/what-we-do/innovation/skills-for-success/awards/

We were originally told that "the URLs for the attachments are repeating" but when I looked into it I only found a problem with the PDFs for IDEA RAISES Student Achievement and Perseverance Process Project, where the link was mangled to something like this: http://www2.ed.govhttp://www2.ed.gov/programs/skillssuccess/2015unfunded/ideaabst.pdf

My guess is that it's because there are 2 instances of the same link on the page. I'm not sure why this would cause a problem, but it might be the case. It's doing the same on our test server, so see what you can do to fix it. Thanks.
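
To illustrate how two instances of the same link could produce that doubled host, here is a minimal Python sketch. The plugin's actual scraping code isn't shown in this thread, so the function names `absolutize_naive` and `absolutize_once` are hypothetical, and the sketch assumes the scraper collects every relative href and then runs one global string replace per href found. Under that assumption, the second pass matches the path again inside the URL that the first pass already made absolute.

```python
import re
from urllib.parse import urljoin

BASE = "http://www2.ed.gov/"  # site root, taken from the URLs in this report

def absolutize_naive(html, hrefs):
    """Hypothetical buggy pass: one global replace per href that was found."""
    for href in hrefs:
        # str.replace rewrites every occurrence, including ones a previous
        # iteration already prefixed with the host.
        html = html.replace(href, BASE.rstrip("/") + href)
    return html

def absolutize_once(html):
    """Hypothetical safer pass: rewrite each href attribute exactly once."""
    # urljoin leaves already-absolute URLs unchanged.
    return re.sub(
        r'href="([^"]*)"',
        lambda m: 'href="' + urljoin(BASE, m.group(1)) + '"',
        html,
    )

PDF = "/programs/skillssuccess/2015unfunded/ideaabst.pdf"
page = '<a href="%s">Abstract</a> <a href="%s">PDF</a>' % (PDF, PDF)

hrefs = re.findall(r'href="([^"]*)"', page)  # the same path is found twice
mangled = absolutize_naive(page, hrefs)
fixed = absolutize_once(page)

print(mangled)
print(fixed)
```

If this is what's happening, de-duplicating the collected hrefs before replacing, or rewriting each attribute in a single pass as in `absolutize_once`, would avoid the doubled prefix.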

@johnpaulbalagolan
Contributor

Fixed #14 with the latest source committed to GitHub. It took me a while to figure this out, as I had to trace the scraping code and have had intermittent issues with testing the scraping locally.
