karatekaneen/avanza-scraper

AvanzaScraper

This app scrapes all the stocks from Avanza's stock list page. It extracts the id and linkName properties, which are needed to fetch the price data later on. It also grabs the name of each stock and the list it belongs to.

Usage

For basic usage, which scrapes the default lists, you only need to run index.js.

createScrapeStocks

Factory function for the scraper.

Parameters

  • deps Object
    • deps.puppeteer Function The puppeteer library (optional, default require('puppeteer'))
    • deps.siteActions Object Collection of helper functions to extract data from the page (optional, default require('./siteActions'))
    • deps.sleep Function Sleep function (optional, default require('./helpers').sleep)

scrapeStocks

The main scraping function. It loads the page, selects which lists to take the stocks from, and extracts the data from the table.

Parameters

  • params Object
    • params.listsToSave Array<String> The lists we want to save (optional, default [])
    • params.url String Url to the page we want to scrape (optional, default 'https://www.avanza.se/aktier/lista.html')
    • params.headless Boolean Run the browser in headless mode or not (optional, default true)
    • params.sleepTime Number The time to sleep between attempts to load more data. Increase this if not all data is loaded before the scraper moves on. (optional, default 1000)

Returns Array<Object> Array with all the scraped data

openListMenu

Opens the menu at the top of the page for selecting which lists of stocks to include in the table.

Returns void

selectActiveLists

Selects which lists to scrape data for and clicks the selector for each one.

Parameters

  • lists Array<String> The names of the lists we want to scrape. Case sensitive for now.

Returns void

showMoreStocks

Clicks the "show more" button to load the remaining stocks. When there are no more stocks to load, the button's display property changes to "none", so the function checks whether it is still "inline-block" and, if so, clicks it.

Returns void
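The click-until-hidden loop could look roughly like the sketch below. It uses Puppeteer's page.$eval and page.click; the selector, the use of the computed style, and the returned click count are assumptions made for illustration (the documented function returns void). A fake page object is included so the loop can be exercised without a browser.

```javascript
// Sketch: keep clicking while the button is still displayed as inline-block.
// `selector` is a hypothetical CSS selector for the "show more" button.
const showMoreStocks = async (page, { selector, sleep, sleepTime = 1000 }) => {
	let clicks = 0
	// The callback runs in the page context and reads the button's display value
	const getDisplay = () => page.$eval(selector, button => getComputedStyle(button).display)

	while ((await getDisplay()) === 'inline-block') {
		await page.click(selector)
		clicks++
		await sleep(sleepTime) // wait for the next chunk of rows to load
	}
	return clicks
}

// Fake page: reports "inline-block" until `pagesLeft` clicks have happened
const createFakePage = pagesLeft => ({
	$eval: async () => (pagesLeft > 0 ? 'inline-block' : 'none'),
	click: async () => {
		pagesLeft--
	}
})
```

If rows go missing with the real page, a longer sleepTime gives the table more time to load between clicks.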

extractTableData

Extracts the data from the table of stocks.

Returns Array<Object> all the stocks in the list

sleep

Pauses for the given time to wait for content to load.

Parameters

  • milliseconds Number number of milliseconds to wait

Returns Promise<void>
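A helper with this signature is commonly written as a promise wrapped around setTimeout. The following is a minimal sketch along those lines, not necessarily the code in ./helpers; the demo function is only there to show usage.

```javascript
// Resolve after the given number of milliseconds
const sleep = milliseconds => new Promise(resolve => setTimeout(resolve, milliseconds))

// Usage inside an async function: measure how long the pause took
const demo = async () => {
	const before = Date.now()
	await sleep(50)
	return Date.now() - before
}
```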

dateToKey

Takes a two-dimensional array that looks like:

const dataArr = [
	[
		15151651351351, // Stringified date
		432143543 // Some data
	]
]

and returns an object:

const output = {
	"15151651351351": {
		date: 15151651351351,
		data: 432143543
	}
}

Looking entries up by date key avoids a nested scan when combining several arrays: the work is roughly O(2N), i.e. linear, instead of O(N^2).

Parameters

Returns Object Object with the date as key
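One way to write such a conversion is shown below. This is a sketch based on the input and output shapes above, not necessarily the repository's implementation; the timestamps and values in the usage example are made up.

```javascript
// Turn [[date, data], ...] into { [date]: { date, data } }
const dateToKey = dataArr => {
	const output = {}
	for (const [date, data] of dataArr) {
		// Object keys are strings, so a numeric date is stringified here
		output[date] = { date, data }
	}
	return output
}

const volumeByDate = dateToKey([
	[1546300800000, 1200],
	[1546387200000, 1500]
])

// Direct lookup by date instead of scanning the array
const entry = volumeByDate[1546300800000]
```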

createQueue

Executes an array of promise-based tasks while capping the number of simultaneous requests.

Parameters

  • tasks Array<Promise> Tasks to be performed. Must be promise-based.
  • maxNumOfWorkers number Maximum number of simultaneous open requests (optional, default 4)
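A concurrency cap like this is often built as a small pool of workers that pull tasks from a shared index. The sketch below assumes each task is a function that returns a promise (a promise that has already been created is already running, so it cannot be held back), and that results are returned in task order; both are assumptions, not documented behavior.

```javascript
// Sketch: run at most `maxNumOfWorkers` tasks at the same time
const createQueue = async (tasks, maxNumOfWorkers = 4) => {
	const results = new Array(tasks.length)
	let nextIndex = 0

	// Each worker keeps taking the next unstarted task until none are left
	const worker = async () => {
		while (nextIndex < tasks.length) {
			const index = nextIndex++
			results[index] = await tasks[index]()
		}
	}

	const workerCount = Math.min(maxNumOfWorkers, tasks.length)
	await Promise.all(Array.from({ length: workerCount }, () => worker()))
	return results
}

// Usage: track how many tasks are in flight at once
let running = 0
let peak = 0
const makeTask = value => async () => {
	running++
	peak = Math.max(peak, running)
	await new Promise(resolve => setTimeout(resolve, 10))
	running--
	return value * 2
}
const queueDemo = () => createQueue([1, 2, 3, 4, 5].map(makeTask), 2)
```

In this sketch a rejected task rejects the whole queue; a real scraper would probably want to catch errors per task.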

createSaveStockList

Factory function for saveStockList

Parameters

  • deps Object
    • deps.Firestore Object The Firestore Lib (optional, default require('@google-cloud/firestore'))
    • deps.credentials Object credentials for the database (optional, default require('../../secrets.json'))

Returns Function saveStockList

createSavePricesToStock

Factory function for savePricesToStock.

Parameters

  • deps Object
    • deps.Firestore Object The Firestore Lib (optional, default require('@google-cloud/firestore'))
    • deps.credentials Object credentials for the database (optional, default require('../../secrets.json'))

Returns Function savePricesToStock

saveStockList

Takes an array of stocks and saves them to the database in batches.

Parameters

  • stocks Array<Object> Array of stocks to be saved in the database.

Returns Array<Object> The original array of stocks
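Saving in batches might look like the sketch below. It relies on Firestore's batch(), set() and commit() calls, but the collection name 'stocks', the use of the stock id as document id, and the batch size of 500 are assumptions; check the Firestore documentation for the current write limit. For brevity the factory here takes an already constructed db client instead of the Firestore library and credentials, and a fake db is included so the chunking can be run locally.

```javascript
const BATCH_SIZE = 500 // assumption, see the note above

// Sketch: write the stocks in chunks, one batch commit per chunk
const createSaveStockList = ({ db }) => async stocks => {
	for (let i = 0; i < stocks.length; i += BATCH_SIZE) {
		const batch = db.batch()
		for (const stock of stocks.slice(i, i + BATCH_SIZE)) {
			// Hypothetical layout: one document per stock, keyed by its id
			const ref = db.collection('stocks').doc(String(stock.id))
			batch.set(ref, stock)
		}
		await batch.commit()
	}
	return stocks
}

// Fake db that records how many writes each commit contained
const createFakeDb = () => {
	const commits = []
	return {
		commits,
		collection: name => ({ doc: id => ({ path: `${name}/${id}` }) }),
		batch: () => {
			const writes = []
			return {
				set: (ref, data) => writes.push({ ref, data }),
				commit: async () => {
					commits.push(writes.length)
				}
			}
		}
	}
}
```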

savePricesToStock

This function adds the price data to the stock document in the database.

Parameters

  • params Object
    • params.id Number Id of the stock to update
    • params.priceData Array<Object> The data to add to the document
    • params.name String The name, only used for logging

Returns Object Reference to the database document.

priceScraper

The main function that handles the workflow of fetching the price data and saving it to the stock in the database.

TODO Error handling

Parameters

  • params Object
    • params.stocks Array<Object> Array of stocks to fetch price data about
    • params.settings Object Settings about the data fetching
      • params.settings.start Date First date to include in the data
      • params.settings.end Date Last date to include in the data
      • params.settings.maxNumOfWorkers Object Maximum number of simultaneous open requests

Returns void

createFetchPriceData

Factory function for fetchPriceData

Parameters

  • deps Object
    • deps.fetch Function node-fetch library (optional, default require('node-fetch'))
    • deps.parsePriceData Function (optional, default require('../data/dataParser').createParsePriceData({}))

Returns Function fetchPriceData

createPriceScraper

Factory function for priceScraper

Parameters

  • deps Object
    • deps.fetchPriceData Function Function to download data (optional, default this.createFetchPriceData({}))
    • deps.savePricesToStock Function Database handler (optional, default require('../data/dataSaver').createSavePricesToStock({}))
    • deps.createQueue Function Limiting the amount of open requests (optional, default require('./helpers').createQueue)

Returns Function priceScraper

fetchPriceData

Function to fetch price, volume and owner data from Avanza.

Parameters

  • params Object
    • params.orderbookId Number id of the stock to fetch
    • params.start Date First date to fetch - this day will be included (optional, default new Date('2019-01-01T22:00:00.000Z'))
    • params.end Date Last date to fetch - this day will be included (optional, default new Date())

Returns Array<Object> Price data for the given stock

createParsePriceData

Factory function for parsePriceData

Parameters

  • deps Object
    • deps.dateToKey Function Assigns the date as the object key to avoid O(N^2) lookups and get O(2N) instead (optional, default require('../scraping/helpers').dateToKey)

Returns Function parsePriceData

parsePriceData

Parses the data from the Avanza API into an array of objects. The output looks like:

const output = [
	{
		open: 123,
		high: 125,
		low: 121,
		close: 123,
		volume: 374298, // # Stocks traded
		owners: 4200 // # owners on Avanza
	}
]

Parameters

  • json Object The price data from the API response
    • json.dataPoints Array<Array> The array of price data originally provided
    • json.volumePoints Array<Array> Array of date and volume traded on that day
    • json.ownersPoints Array<Array> Array of date and number of owners on Avanza on that date

Returns Array<Object> The parsed price data.
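The date-keyed merge can be illustrated with the two arrays whose layout is described above (date plus one value). The sketch leaves out dataPoints because its row layout is not documented here; the function name mergeVolumeAndOwners, the date field in the result, the null fallback and the sample values are all assumptions made for the example.

```javascript
// Same idea as dateToKey: index [date, data] rows by their date
const dateToKey = dataArr =>
	Object.fromEntries(dataArr.map(([date, data]) => [date, { date, data }]))

// Sketch: one pass to index the owners, one pass over the volumes
const mergeVolumeAndOwners = ({ volumePoints, ownersPoints }) => {
	const ownersByDate = dateToKey(ownersPoints)
	return volumePoints.map(([date, volume]) => ({
		date,
		volume,
		// null when there is no owner count for that date
		owners: ownersByDate[date] ? ownersByDate[date].data : null
	}))
}

const merged = mergeVolumeAndOwners({
	volumePoints: [
		[1546300800000, 374298],
		[1546387200000, 120000]
	],
	ownersPoints: [[1546300800000, 4200]]
})
```

Each array is traversed once, which is where the O(2N) figure comes from; searching ownersPoints for every volume row would be the O(N^2) alternative.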
