Support for nested json #15

Closed
sahin opened this issue May 3, 2016 · 13 comments
@sahin
sahin commented May 3, 2016

What a great lib! I moved nearly 40 different parsers to your library in less than 2 hours, with a huge decrease in LOC. :)

scrapeIt(url, {
    provider: ".powered-by"
    , status: {
        description: ".page-status span.status"
    }

}).then(page => {
    console.log(page);
});

I am getting

                   let value = typpy(cOpt.how, Function) ? cOpt.how($elm) : $elm[cOpt.how]();
                                                                                 ^

TypeError: Cannot read property 'text' of undefined
@IonicaBizau
Owner

Hmm, interesting. What is the url or HTML when you're getting this? That simply means the $elm variable is undefined (not really sure why).

Can you provide the HTML?

Glad to hear it's helpful! 😁

@IonicaBizau IonicaBizau added the bug label May 3, 2016
@sahin
Author

sahin commented May 3, 2016

'https://status.airbrake.io'

@IonicaBizau For now, I worked around the error with a simple function. I simply did:

  scrapeIt(url, {
            status: {
                selector: ".page-status span.status",
                convert:  description => parseStatusCurrent(description)
            }
            , provider: {
                selector: ".powered-by"
                , convert: (function (text) {
                    if(text == "Powered by StatusPage.io"){
                        return "StatusPage.io";
                    }
                    else {
                        return "Unknown";
                    }
                })
            }
        }).then(status => {
            status.updated_at = new Date().toISOString();
            status.url = url;
            console.log(status);
            resolve(status);
        });

and parseStatusCurrent:


export function parseStatusCurrent(description){
    var status = {};
    status.description = description.trim();
    if(status.description == 'All Systems Operational'){
        status.indicator = "Operational";
        status.color = "green";
    }
    else if(status.description.indexOf('Minor')>-1){
        status.indicator = "Not Fully Operational";
        status.color = "yellow";
    }
    else {
        status.indicator = "Not Fully Operational";
        status.color = "red";
    }
    return status;
}

@IonicaBizau
Owner

Ah, I see what you mean. I think in this case, your implementation is good. There is support for nested lists. For nested objects, I'm not sure what the syntax would look like. 💭

@sahin
Author

sahin commented May 3, 2016

how about "child" ?

status: {
    child: {
        description: 'description'
    }
}

@IonicaBizau
Owner

Or maybe instead of child, just data, like for nested lists?
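
For illustration, the options from the top of the thread might look like this with a `data` key. This is only a sketch of the shape being proposed here, not confirmed library syntax, and `optionsWithData` is a hypothetical name.

```javascript
// Sketch of the proposed shape: nested fields live under `data`,
// mirroring how nested lists declare their fields.
// Assumption: at this point in the thread the syntax is only a proposal.
const optionsWithData = {
    provider: ".powered-by"
  , status: {
        data: {
            description: ".page-status span.status"
        }
    }
};

// Print the structure so the nesting is easy to inspect.
console.log(JSON.stringify(optionsWithData, null, 2));
```

Keeping the nested fields under one dedicated key means a field can have any name without clashing with option names.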

@sahin
Author

sahin commented May 3, 2016

Or basically reserve the words like listItem, selector, and convert; the rest will be whitelisted and used as is.
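
A minimal sketch of how this reserved-word idea could work, assuming a hypothetical helper `splitField` and a hypothetical `RESERVED_KEYS` list (neither comes from the library): keys in the reserved list are treated as options, and every other key is treated as a nested field name.

```javascript
// Hypothetical reserved option names; the real library's list may differ.
const RESERVED_KEYS = new Set(["listItem", "selector", "convert"]);

// Split a field definition into reserved options and nested field names.
function splitField(definition) {
    const options = {};
    const nested = {};
    for (const [key, value] of Object.entries(definition)) {
        if (RESERVED_KEYS.has(key)) {
            options[key] = value;   // known option, e.g. `selector`
        } else {
            nested[key] = value;    // anything else becomes a nested field
        }
    }
    return { options, nested };
}

const result = splitField({
    selector: ".page-status"
  , description: "span.status"
});
console.log(result);
```

One trade-off of this approach is that a nested field could never be named `selector`, `convert`, or `listItem`, which a dedicated key such as `data` avoids.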

@IonicaBizau IonicaBizau self-assigned this May 3, 2016
IonicaBizau added a commit that referenced this issue May 3, 2016
@IonicaBizau
Owner

@sahin This is now possible in >=2.2.0. Check out this example.

@sahin
Author

sahin commented May 7, 2016

@IonicaBizau This is very, very good.

It might be a good idea to create a wiki page with more examples, linked from the README.md.

@IonicaBizau
Owner

@sahin Maybe just add the examples to the example file. My README.md generator takes that file and puts the content in the readme file automagically. A wiki page would be hard to maintain since I generate everything. 😁

@sahin
Author

sahin commented May 9, 2016

I added scrape-it here:

https://github.com/sahin/status-page/tree/master/_servicesWillBeAdded/_HowToAddaService

"2) Create a file called Parse.js (example check Statusio.js) for parsing I suggest cheerio, jsdom or scrape-it"

:)

If you star the repo too, that would be great.

@IonicaBizau
Owner

@sahin Great! Nice project, but I rarely star things on GitHub. 😁

@sahin
Author

sahin commented May 9, 2016

@IonicaBizau No worries, we have thousands of stars in general.

Check this out, see how clean it is:
https://github.com/sahin/status-page/blob/master/src/lib/Parsers/Statusio.js

@IonicaBizau
Owner

@sahin If you npm publish it, your project will appear in the README.md of scrape-it the next time it is generated. 😀
