-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reorganise variable processing #1189
Reorganise variable processing #1189
Conversation
d7da85d
to
1635126
Compare
1635126
to
aee17fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some annotations:
211bc3f
to
5a974ef
Compare
72eabad
to
b19dcc6
Compare
ea2ab6b
to
d362fb8
Compare
616ea6e
to
774a270
Compare
774a270
to
7226375
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marvinchin I shelved making parser
an instance variable as I realised there's some benefits to that while adding new plugin hooks in an unrelated pr.
Otherwise this is still quite a large PR, with 2 logical commits (the first commit makes more sense with the second though). Let me know if you prefer to review them separately!
*The number of loc in this PR increased quite a bit, but due to the very heavy documentation and extra blank lines (for better logical separation) / etc I added
Some annotations otherwise (ignore the resolved comments):
src/Page.js
Outdated
const userDefinedVariables = this.userDefinedVariablesMap[parentSite]; | ||
return `${pageData}\n${njUtil.renderRaw(footerContent, userDefinedVariables)}`; | ||
|
||
return `${pageData}\n${this.variablePreprocessor.renderSiteVariables(this.sourcePath, footerContent)}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
example of site variable processing boilerplate reduction (the second arrow), similar ones follow throughout in page.js
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interpolation looks a bit complicated. Should we split this out into two separate steps?
(i.e. first call renderSiteVariables
and store the result in a variable, then do the interpolation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done! (for the above header content as well)
// and let the baseUrl value be injected later. | ||
userDefinedVariables.baseUrl = '{{baseUrl}}'; | ||
this.userDefinedVariablesMap[base] = userDefinedVariables; | ||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example of arrow number 1 in the diagram (site variable extraction), similar ones follow in this file
this.dynamicIncludeSrc = []; | ||
this.staticIncludeSrc = []; | ||
this.missingIncludeSrc = []; | ||
} | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the methods relating to page variable extraction and rendering are completely changed (there's a high level overview of the changes in the original post).
Do let me know if you prefer me to point out the characteristics (as per the overview) in the original code!
...additionalVariables, | ||
...innerVariables, | ||
}); | ||
const renderedContent = this.variablePreprocessor.renderPage(file, content, additionalVariables); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huge boilerplate reduction, due to the process overhaul and method extraction.
The left shows the lengthy 2 step process as mentioned in the overview in the op.
This is also the first of 2 methods belonging to function 4 I changed. (the second being renderIncludeFile
(annotated below shortly))
const { | ||
renderedContent, | ||
childContext, | ||
} = variablePreprocessor.renderIncludeFile(actualFilePath, element, context, filePath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar boilerplate reduction here, and also the second method of function 4.
The difference here (and also per the original code) is that the child context is returned which is necessary to render nested includes with the parent include's <variable>
s wrapped by <include>
s.(https://markbind.org/userGuide/reusingContents.html#specifying-variables-in-an-include)
The 'page' (included content) is also rendered with the extracted variables wrapped inside the include
@@ -318,6 +315,16 @@ function _preprocessVariables() { | |||
return utils.createEmptyNode(); | |||
} | |||
|
|||
function _preprocessImports(node, parser) { | |||
if (node.attribs.from) { | |||
parser.staticIncludeSrc.push({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
previously this is done inside the variable processing methods, hence variablePreprocessor
needs to know of parser
. I moved it here to remove that dependency
* which have greater priority than the extracted include variables of the current context. | ||
* @param asIfAt where the included file should be rendered from | ||
*/ | ||
renderIncludeFile(filePath, node, context, asIfAt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The gist of the revamp is summarised in this method and the one below as mentioned.
The main highlights are:
- the process is made pure - no static
Parser
variables are needed any longer - no hacky workarounds like
Proxy
due to removal of the 2-step extraction/rendering into one - better performance 😁
The differences in this and render page are:
- we need to render
<include>
files with the<include **var-xx="..."**>**<variable>...</variable>**</include>
s - the child context is returned for processing of nested
<include>
s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work trying to untangle the parser! I think separating variable processing into its own class is a step in the right direction in trying to simplify the parser. There seems to be a bug regarding the included variables though! I've also left some other comments and questions.
The new approach makes sense to me, and I think it should work. I'm not super familiar with the inner workings of the parser, so it might be good to have @yamgent or @acjh take a closer look at this as well since it's a critical part of markbind.
src/Page.js
Outdated
const userDefinedVariables = this.userDefinedVariablesMap[parentSite]; | ||
return `${pageData}\n${njUtil.renderRaw(footerContent, userDefinedVariables)}`; | ||
|
||
return `${pageData}\n${this.variablePreprocessor.renderSiteVariables(this.sourcePath, footerContent)}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interpolation looks a bit complicated. Should we split this out into two separate steps?
(i.e. first call renderSiteVariables
and store the result in a variable, then do the interpolation)
} | ||
|
||
/** | ||
* Renders the variable in addition to adding it, unlike {@link addUserDefinedVariable}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should provide an explanation of why this is useful (I'm guessing this is for the same reason as L562 of Site.js
).
Also, I think renderAndAddUserDefinedVariable
might be more accurate 😛 It'd be great if we can find a function name that more accurately reflects the intention of what it's supposed to do, but I can't seem to think of one just yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done 😃
* @param contentFilePath to look up. | ||
* @return {*} The appropriate (closest upwards) site variables map | ||
*/ | ||
getContentParentSiteVariables(contentFilePath) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaning towards naming the function getParentSiteVariables(contentFilePath)
for simplicity
* @param lowerPriorityVariables than the site variables, if any | ||
* @param higherPriorityVariables than the site variables, if any | ||
*/ | ||
renderSiteVariables(contentFilePath, content, lowerPriorityVariables = {}, higherPriorityVariables = {}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we name lowerPriorityVariables
-> variableDefaults
and higherPriorityVariables
-> variableOverrides
to reflect our intention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its not so much variableDefaults
; it really depends as we have quite a few types of variables which order needs to be respected, and quite a few use cases. I think leaving it as lower/higherPriority and leaving the reader to see where this is called is better.
e.g. when used in rendering the site nav there is no defaults/overrides
to speak of,
when used in extracting page variables defaultVariables
becomes { ...importedPageVariables, ...pageVariables, ...includeVariables }
(that have been so far collected to add) which miscommunicates the intention of defaultVariables
.
}, {}, false); | ||
}; | ||
|
||
$('variable').each((index, node) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the loading of variable from another file can be abstracted to another function (ignore if this method is revamped in part 2).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup its revamped
src/Site.js
Outdated
@@ -688,7 +688,6 @@ class Site { | |||
const filePathArray = Array.isArray(filePaths) ? filePaths : [filePaths]; | |||
const uniquePaths = _.uniq(filePathArray); | |||
logger.info('Rebuilding affected source files'); | |||
this.variablePreprocessor.resetVariables(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we no longer need to reset the variables? What if a variable has been deleted? Would the old value associated with the variable also be cleared when rebuilding the page?
* @param pageImportedVariables object to add the extracted imported variables to | ||
* @param elem "dom node" of the <import> element as parsed by htmlparser2 | ||
* @param filePath that the <variable> tag is from | ||
* @param renderVariable callback to render the 'from' attribute with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we rename renderVariable
-> renderFrom
to make it clearer?
logger.warn(`Error ${err.message}`); | ||
} | ||
// NOTE: Selecting both at once is important to respect variable/import declaration order | ||
$('variable, import[from]').not('include > import').each((index, elem) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be include > variable
instead, to avoid recognizing include variables as page variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this the bug mentioned? yes it should be (I derped this) 😢 thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to add, the .not('include > variable')
was actually not present before (found this bug while refactoring).
I included here since it just requires a minor addition though.
The below test I included for this ensures this, though I misread Included Variable Overriding Page Variable
in the expected test files 😅 😅
@@ -691,7 +688,6 @@ class Site { | |||
const filePathArray = Array.isArray(filePaths) ? filePaths : [filePaths]; | |||
const uniquePaths = _.uniq(filePathArray); | |||
logger.info('Rebuilding affected source files'); | |||
MarkBind.resetVariables(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we no longer need to reset the variables? What if a variable has been deleted? Would the old value associated with the variable also be cleared when rebuilding the page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, the process is made pure, while maintaining modest ~10% performance improvements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add, the process before essentially involved 2x extraction and 2x rendering, storing the intermediate extracted variables in Parser
's static variables.
The process is now pure since its just one step, so we have no need for that.
} = require('./constants'); | ||
|
||
class Parser { | ||
constructor(options) { | ||
this._options = options || {}; | ||
this._fileCache = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like our new implementation no longer caches included files. Should we add this to the variablePreprocessor
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this for now as the performance benefits are quite minor for the added code (this pr improves performance overall even with this removed as a point); also investigated here: #138
The file cache was also only used in the renderIncludeFile
method in the old code - which means other areas like Page
reading its initial source file dosen't benefit from this.
I think the right way to reintroduce this would be a FileCache
abstraction of sort to the whole code - only then would the performance improvements be substantial enough to warrant it.
7226375
to
fe1ca23
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review on this rather large pr 😅! @marvinchin
@@ -691,7 +688,6 @@ class Site { | |||
const filePathArray = Array.isArray(filePaths) ? filePaths : [filePaths]; | |||
const uniquePaths = _.uniq(filePathArray); | |||
logger.info('Rebuilding affected source files'); | |||
MarkBind.resetVariables(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, the process is made pure, while maintaining modest ~10% performance improvements.
src/Page.js
Outdated
const userDefinedVariables = this.userDefinedVariablesMap[parentSite]; | ||
return `${pageData}\n${njUtil.renderRaw(footerContent, userDefinedVariables)}`; | ||
|
||
return `${pageData}\n${this.variablePreprocessor.renderSiteVariables(this.sourcePath, footerContent)}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done! (for the above header content as well)
} = require('./constants'); | ||
|
||
class Parser { | ||
constructor(options) { | ||
this._options = options || {}; | ||
this._fileCache = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this for now as the performance benefits are quite minor for the added code (this pr improves performance overall even with this removed as a point); also investigated here: #138
The file cache was also only used in the renderIncludeFile
method in the old code - which means other areas like Page
reading its initial source file dosen't benefit from this.
I think the right way to reintroduce this would be a FileCache
abstraction of sort to the whole code - only then would the performance improvements be substantial enough to warrant it.
} | ||
|
||
/** | ||
* Renders the variable in addition to adding it, unlike {@link addUserDefinedVariable}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done 😃
* @param lowerPriorityVariables than the site variables, if any | ||
* @param higherPriorityVariables than the site variables, if any | ||
*/ | ||
renderSiteVariables(contentFilePath, content, lowerPriorityVariables = {}, higherPriorityVariables = {}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its not so much variableDefaults
; it really depends as we have quite a few types of variables which order needs to be respected, and quite a few use cases. I think leaving it as lower/higherPriority and leaving the reader to see where this is called is better.
e.g. when used in rendering the site nav there is no defaults/overrides
to speak of,
when used in extracting page variables defaultVariables
becomes { ...importedPageVariables, ...pageVariables, ...includeVariables }
(that have been so far collected to add) which miscommunicates the intention of defaultVariables
.
logger.warn(`Error ${err.message}`); | ||
} | ||
// NOTE: Selecting both at once is important to respect variable/import declaration order | ||
$('variable, import[from]').not('include > import').each((index, elem) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this the bug mentioned? yes it should be (I derped this) 😢 thanks!
logger.warn(`Error ${err.message}`); | ||
} | ||
// NOTE: Selecting both at once is important to respect variable/import declaration order | ||
$('variable, import[from]').not('include > import').each((index, elem) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to add, the .not('include > variable')
was actually not present before (found this bug while refactoring).
I included here since it just requires a minor addition though.
The below test I included for this ensures this, though I misread Included Variable Overriding Page Variable
in the expected test files 😅 😅
fe1ca23
to
5249d74
Compare
Resolved a minor conflict with #1222 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems good to me! There seems to be a change in the output for testImportVariables
, but the new version seems to be correct. Do you mind taking a look and that and confirming it?
2103 site should have no diffs, save for logs and timestamps
Just to confirm, I assume you've verified this with the new changes? 🙂
@@ -6,7 +6,7 @@ | |||
<p>Test import variables that itself imports other variables:</p> | |||
<h2 id="trying-to-access-a-page-variable">Trying to access a page variable:<a class="fa fa-anchor" href="#trying-to-access-a-page-variable" onclick="event.stopPropagation()"></a></h2> | |||
<p>There should be something red below:</p> | |||
<div style="color:red">This is a page variable from <a href="http://variablestoimport.md">variablestoimport.md</a></div> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the removal of the anchor tags expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like there was a bug with our previous implementation that got variablestoimport.md
to be rendered as a link. The new version seems to be the correct one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, this is the source <variable name="pagevar">This is a page variable from variablestoimport.md</variable>
, it was there at first due to the removed line 125 of parser.js
which does md.renderInline
on it.
All md rendering is done in the render
stage, so this isn't necessary.It gets removed because in the source file, this is the code:
<div style='color:red'>{{pagevar}}</div>
by commonmark spec (since the start of the line is a <div>
) pagevar
which contains the anchor dosen't get markdown treatment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm, I assume you've verified this with the new changes? 🙂
just checked again on the latest master (clean rebase below), no changes save for logs & timestamps!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll get this in then 😁
Processing variables involves a lot of boilerplate code with minor variations. This decentralizes variable processing, decreasing code maintainability and readability. Let's embark on a two stage refactor to improve variable processing. In this first refactor, we create the variablePreprocessor abstraction, and completely migrate the use of userDefinedVariablesMap for site variables to variablePreprocessor, reducing boilerplate code relating to site variables. Additionally, we move all page variable processing related methods to variablePreprocessor from Parser, abstracting such methods into smaller units of functionality where possible, and enhancing in code documentation. In the second part, we will similarly refactor page variable processing methods to reduce more boilerplate, increasing code maintainability.
Processing page variables involves a two step process. The first step extracts and renders locally declared page variables in <variable> tags, using javascript proxies to force imported variables to re-render as themselves again, which decreases code clarity. The second step then extracts the imported variables, and stores such variables into an alias map, then re-rendering the page a second time. This process can be combined into a single pass to simplify the process greatly and remove the need for workarounds like proxies. This also increases rendering performance slightly, as only a single pass is made Moreover, variable tags in <include> tags are erroneously recognised as page variables. Let's include a simple fix for this as well.
5249d74
to
d6a00b0
Compare
What is the purpose of this pull request? (put "X" next to an item, remove the rest)
• [x] Other, please explain:
Yet another part of #810
Prelude to #931, #894 type of prs - my solution for #931 added quite some boilerplate, which I don't really think is sustainable anymore -- this should be a good point to refactor it
What is the rationale for this request?
Site.js
requires the entireParser
abstraction to reset site variables during live reloadrenderIncludeFile/extractPageVariables/...
What changes did you make? (Give an overview)
variablePreprocessor
, which will be responsible for all variable storage and renderingThis doodle explains the intended endgame better (pink lines are read as 'uses'):

There are two logical parts to this refactor
The first (commit):
Completes and unifies where site variables are accessed and stored ( the first, second pink lines, and part of the fourth pink line )
parser
tovariablePreprocessor
without much change ( just some misc refactors like SLAP, etc per above )The second (commit):
Completes and logically changes functions 3 / 4.
The variable processing flow before was such that
<import>
s and locally declared<variable>
s were extracted and rendered separately. This inevitably results in more boilerplate ( e.g. note double extraction in the code above ), and uses dirty hacks likeProxy
to force nunjucks to output namespaced variables back into themselves.This commit unifies the two processes hence (see
extractPageVariables
/renderPage
), extracting and rendering both in one go while respecting the variable declaration order still. It also removes the need forProxy
s, and has some modest performance benefits due to the reduced number of passes.End goals:
Is there anything you'd like reviewers to focus on?
The main motive for this is to solve #931 type of issues easily, and hopefully make variable processing development in the future easier.
With this in mind, could the architecture be further improved?
as an example for #931, with all site variable rendering now centralised, by intended solution is to let
variablePreprocessor
hold multiple nunjucks instances in the future, each configured with its root path to be the respective root paths of the (sub)sites.This should allow any nunjucks tags that uses paths to resolve correctly across (sub)sites
Testing instructions:
Proposed commit message: (wrap lines at 72 characters)
(pending review of the commit structure)
Reorganise variable processing (part 1)
Processing variables involves a lot of boilerplate code with minor
variations. This decentralizes variable processing, decreasing code
maintainability and readability.
Let's embark on a two stage refactor to improve variable processing.
In this first refactor, we create the variablePreprocessor abstraction,
and completely migrate the use of userDefinedVariablesMap for site
variables to variablePreprocessor, reducing boilerplate code relating
to site variables.
Additionally, we move all page variable processing related methods to
variablePreprocessor from Parser, abstracting such methods into
smaller units of functionality where possible, and enhancing in code
documentation.
In the second part, we will similarly refactor page variable processing
methods to reduce more boilerplate, increasing code maintainability.