[instagram] extract '__additionalDataLoaded' (#391)
The '_sharedData' of Post pages is missing its 'graphql' part for
logged-in users. This data is now included in the parameters of a
function call to '__additionalDataLoaded(...)'.

And, of course, video extraction with youtube-dl broke because of
this change as well.
mikf committed Oct 29, 2019
1 parent 5af291b commit 5fa6ff0
Showing 1 changed file with 10 additions and 2 deletions.
12 changes: 10 additions & 2 deletions gallery_dl/extractor/instagram.py
@@ -101,8 +101,16 @@ def _request_graphql(self, variables, query_hash, csrf=None):
 
     def _extract_shared_data(self, url):
         page = self.request(url).text
-        data = text.extract(page, 'window._sharedData = ', ';</script>')[0]
-        return json.loads(data)
+        shared_data, pos = text.extract(
+            page, 'window._sharedData =', ';</script>')
+        additional_data, pos = text.extract(
+            page, 'window.__additionalDataLoaded(', ');</script>', pos)
+
+        data = json.loads(shared_data)
+        if additional_data:
+            next(iter(data['entry_data'].values()))[0] = \
+                json.loads(additional_data.partition(',')[2])
+        return data
 
     def _extract_postpage(self, url):
         shared_data = self._extract_shared_data(url)
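The merge performed by the new '_extract_shared_data' can be tried outside the extractor with the rough sketch below. It is not the project's code: 'extract' is a hypothetical stand-in that only approximates what 'text.extract' appears to do in the diff (return the text between two markers plus the position after the end marker), and the page content, including the '/p/abc/' first argument, is made up for illustration.

```python
import json


def extract(txt, begin, end, pos=0):
    """Hypothetical stand-in for the helper used in the diff.

    Returns (text between 'begin' and 'end', position after 'end'),
    or (None, pos) if either marker is not found.
    """
    try:
        first = txt.index(begin, pos) + len(begin)
        last = txt.index(end, first)
        return txt[first:last], last + len(end)
    except ValueError:
        return None, pos


def extract_shared_data(page):
    shared_data, pos = extract(page, 'window._sharedData =', ';</script>')
    additional_data, pos = extract(
        page, 'window.__additionalDataLoaded(', ');</script>', pos)

    data = json.loads(shared_data)
    if additional_data:
        # Assumed call shape: __additionalDataLoaded(<first arg>, {...});
        # everything after the first comma is treated as the JSON payload.
        payload = json.loads(additional_data.partition(',')[2])
        # Overwrite the first item of the first list in 'entry_data'.
        next(iter(data['entry_data'].values()))[0] = payload
    return data


# Made-up page content, for illustration only.
PAGE = (
    "<script>window._sharedData = "
    '{"entry_data": {"PostPage": [{}]}};</script>'
    "<script>window.__additionalDataLoaded('/p/abc/', "
    '{"graphql": {"shortcode_media": {"id": "1"}}});</script>'
)

data = extract_shared_data(PAGE)
print(data["entry_data"]["PostPage"][0]["graphql"]["shortcode_media"]["id"])
```

Splitting on the first comma only works while the first argument of the call contains no comma itself; that holds for the sample page here, but it is an assumption about the real pages.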
