[Bug]: class 'crawl4ai.models.CrawlResult' object has no attribute 'raw_markdown' #719

Open
barzan-hayati opened this issue Feb 19, 2025 · 0 comments
Labels
🐞 Bug Something isn't working 🩺 Needs Triage Needs attention of maintainers

Comments

@barzan-hayati

crawl4ai version

0.4.248

Expected Behavior

Hello. Thanks for your great free library.

I'm trying to run this sample code

from crawl4ai import AsyncWebCrawler, CacheMode, CrawlerRunConfig
from crawl4ai.content_filter_strategy import PruningContentFilter
from crawl4ai.markdown_generation_strategy import DefaultMarkdownGenerator

md_generator = DefaultMarkdownGenerator(
    content_filter=PruningContentFilter(threshold=0.4, threshold_type="fixed")
)

config = CrawlerRunConfig(
    cache_mode=CacheMode.BYPASS,
    markdown_generator=md_generator
)

async with AsyncWebCrawler() as crawler:
    result = await crawler.arun("https://news.ycombinator.com", config=config)
    print("Raw Markdown length:", len(result.markdown.raw_markdown))
    print("Fit Markdown length:", len(result.markdown.fit_markdown))

To run it in Google Colab, I made some changes, as below:

from crawl4ai import AsyncWebCrawler, CacheMode, CrawlerRunConfig
from crawl4ai.content_filter_strategy import PruningContentFilter
from crawl4ai.markdown_generation_strategy import DefaultMarkdownGenerator

async def main():
    md_generator = DefaultMarkdownGenerator(
        content_filter=PruningContentFilter(threshold=0.4, threshold_type="fixed")
    )

    config = CrawlerRunConfig(
        cache_mode=CacheMode.BYPASS,
        markdown_generator=md_generator
    )

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun("https://milvus.io/api-reference/restful/v2.4.x/v2/Collection%20(v2)/Create.md", config=config)
        print(result.fit_markdown)
        print("Raw Markdown length:", len(result.markdown.raw_markdown))


if __name__ == "__main__":
    await main()

I can print result.fit_markdown successfully, but accessing raw_markdown raises an error. Note that I print result.fit_markdown rather than result.markdown.fit_markdown as in the sample; I could not find an equivalent attribute for raw_markdown.

Thanks in advance

Current Behavior

I get this error:

AttributeError: class 'crawl4ai.models.CrawlResult' object has no attribute 'raw_markdown'
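
For anyone hitting the same error, here is a minimal sketch of a defensive accessor that does not depend on which shape the result takes. It is not taken from the library: `get_raw_markdown` is a hypothetical helper, and the three shapes it checks are assumptions. They are `result.markdown` as an object with a `raw_markdown` attribute (as in the quick start sample), a separate `result.markdown_v2` object, and `result.markdown` as a plain string. The stub objects only stand in for a real CrawlResult so the snippet runs without a crawl.

```python
from types import SimpleNamespace


def get_raw_markdown(result):
    """Return the raw markdown text from a crawl result (hypothetical helper)."""
    markdown = getattr(result, "markdown", None)

    # Shape 1: `markdown` is an object that carries `raw_markdown`.
    raw = getattr(markdown, "raw_markdown", None)
    if isinstance(raw, str):
        return raw

    # Shape 2 (assumption): a separate `markdown_v2` object carries it.
    raw = getattr(getattr(result, "markdown_v2", None), "raw_markdown", None)
    if isinstance(raw, str):
        return raw

    # Shape 3: `markdown` is already a plain string.
    if isinstance(markdown, str):
        return markdown

    raise AttributeError("no raw markdown found on this result object")


# Stub results standing in for a real CrawlResult (illustration only).
as_object = SimpleNamespace(markdown=SimpleNamespace(raw_markdown="# Title"))
as_string = SimpleNamespace(markdown="# Title")

print("Raw Markdown length:", len(get_raw_markdown(as_object)))
print("Raw Markdown length:", len(get_raw_markdown(as_string)))
```

In the Colab code above, the last print would then read `len(get_raw_markdown(result))`. Which shape applies to version 0.4.248 is something the maintainers would need to confirm.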

Is this reproducible?

Yes

Inputs Causing the Bug

sample from quick start page: https://crawl4ai.com/mkdocs/core/quickstart/

Steps to Reproduce

Code snippets

OS

Linux

Python version

3.11.11

Browser

Firefox

Browser version

134.0.2

Error logs & Screenshots (if applicable)

No response

@barzan-hayati barzan-hayati added 🐞 Bug Something isn't working 🩺 Needs Triage Needs attention of maintainers labels Feb 19, 2025