Prevent of bolding entire content pasted from google docs #62
…pboard event. Extend unit test with information about data source. Some small docs and test naming improvements.
…tType method, remove dataSource from automatic tests, which was confusing.
@Mgsy maybe you will have a moment to click through this PR and find some strangely defined Google Docs document which is still pasted as bold. Please be aware that this PR fixes only the situation where the entire pasted text becomes bold. Support for basic styles, lists and other features will be introduced later in follow-up PRs.
Unfortunately, this fix doesn't work on Windows.
@Mgsy can you take a look one more time? It should work now.
I was reviewing this for some time and tried to remove the need for static methods - they are artificial IMO, for test purposes only, and I think that we can spend some more time to clean things up here a bit. And probably make PFO more maintainable for the future (pasting from other office suites or editor types, e.g. Excel).
In order to start this, we need to know the API for that - namely, what must be passed to the normalize method? In the _inputTransformationListener() method either html and dataTransfer are passed, or data.content. This part needs to be unified a bit - it would be nice to have at most 2 params, where the first is the data which needs to be processed and the second is additional data (like dataTransfer for Word).
I'm thinking about an entirely private API for now - just to refactor the switch and make PFO more testable.
The basic idea is to create an interface - from what I see ATM something basic like:
interface Normalizer {
	isActive( html ); // bad name - anyway t
	normalize( html, dataTransfer )
}
This way we will be able to:
- avoid exposing private methods for testing (e.g. by creating a stub with the same API as Normalizer that checks whether the proper normalizer is called and whether it is called only once)
- test normalizers independently (or just as integration tests as now - doesn't really matter)
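The stub idea from the list above could look roughly like the sketch below. It is only an illustration: `runNormalizers()` and `createStubNormalizer()` are hypothetical names that do not come from this PR, and the only assumption is that a normalizer exposes `isActive( html )` and `normalize( html, dataTransfer )` as proposed in the comment.

```javascript
// Hypothetical dispatcher: picks the first normalizer that claims the content.
// Returns the input unchanged when no normalizer is active.
function runNormalizers( normalizers, html, dataTransfer ) {
	const active = normalizers.find( normalizer => normalizer.isActive( html ) );

	return active ? active.normalize( html, dataTransfer ) : html;
}

// Hypothetical stub with the same API as a normalizer.
// It records how many times normalize() was called.
function createStubNormalizer( shouldBeActive ) {
	const stub = {
		calls: 0,
		isActive: () => shouldBeActive,
		normalize( html ) {
			stub.calls++;

			return html;
		}
	};

	return stub;
}

// Usage: only the active stub should be called, and only once.
const inactiveStub = createStubNormalizer( false );
const activeStub = createStubNormalizer( true );

runNormalizers( [ inactiveStub, activeStub ], '<p>Foo bar</p>', null );
```

With such a stub, a test can check which normalizer was picked without reaching into any private method of the plugin.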
src/pastefromoffice.js
Outdated
 * Listener fired during {@link module:clipboard/clipboard~Clipboard#event:inputTransformation `inputTransformation` event}.
 * Detects if content comes from a recognized source and normalize it.
 *
 * **Note**: this function was exposed mainly for testing purposes and should not be called directly.
-  * **Note**: this function was exposed mainly for testing purposes and should not be called directly.
+  * **Note**: this function is exposed mainly for testing purposes and should not be called directly.
src/pastefromoffice.js
Outdated
		data.content = PasteFromOffice._normalizeWordInput( html, data.dataTransfer );
		break;
	case 'gdocs':
		data.content = PasteFromOffice._normalizeGoogleDocsInput( data.content );
Why is data.content taken directly here and not the html, as for the Word input? At least some explanation is needed.
tests/_utils/utils.js
Outdated
	} )
};

PasteFromOffice._inputTransformationListener( null, data );
🔥 I checked that this can be used instead:
editor.plugins.get( 'Clipboard' ).fire( 'inputTransformation', data );
src/pastefromoffice.js
Outdated
 * @param {module:utils/eventinfo~EventInfo} evt
 * @param {Object} data same structure like {@link module:clipboard/clipboard~Clipboard#event:inputTransformation input transformation}
 */
static _inputTransformationListener( evt, data ) {
Usually, we do not use such static methods. Anyway, this method was exported only for test purposes and did not need to be exposed at all, so let's try to fix this (check this: https://github.com/ckeditor/ckeditor5-paste-from-office/pull/62/files#diff-01566d953bd66051289510b52b899e4bR166)
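One way to read this suggestion is to keep the listener private and drive it only through the event, as sketched below. `FakeClipboard` and `PasteFromOfficeSketch` are hypothetical stand-ins, not the real CKEditor 5 classes, and the `trim()` call is just a placeholder for the real normalization.

```javascript
// Stand-in for an observable plugin. This is NOT the real CKEditor 5 Clipboard class.
class FakeClipboard {
	constructor() {
		this._listeners = new Map();
	}

	on( eventName, callback ) {
		if ( !this._listeners.has( eventName ) ) {
			this._listeners.set( eventName, [] );
		}

		this._listeners.get( eventName ).push( callback );
	}

	fire( eventName, data ) {
		for ( const callback of this._listeners.get( eventName ) || [] ) {
			callback( { name: eventName }, data );
		}
	}
}

// Hypothetical plugin: the listener is a closure registered in init(),
// so no static method has to be exposed for tests.
class PasteFromOfficeSketch {
	constructor( clipboard ) {
		this.clipboard = clipboard;
	}

	init() {
		this.clipboard.on( 'inputTransformation', ( evt, data ) => {
			if ( data.isTransformedWithPasteFromOffice ) {
				return;
			}

			data.content = data.content.trim(); // Placeholder for the real normalization.
			data.isTransformedWithPasteFromOffice = true;
		} );
	}
}

// A test touches only the public event.
const clipboard = new FakeClipboard();
new PasteFromOfficeSketch( clipboard ).init();

const data = { content: '  <p>Foo bar</p>  ' };
clipboard.fire( 'inputTransformation', data );
```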
…ct them to separate files. Clean up in contentnormalizer api.
LGTM.
Since there have been quite a few changes, @Mgsy can I ask you to verify the fix once again?
@Mgsy please also check that nothing went wrong with MS Word, as the way the MS Word filters are triggered has also changed.
We're almost there ;) Please take a look at the comments, most importantly:
- There is too much [].forEach() used in tests IMO - some of the loops are redundant and prevent us from writing a clear explanation of what a particular test checks other than case #.
- Some corrections to the docs are needed.
- Some code needs to be moved around (namespace names).
I'm also thinking about a common API for filters, but I think that we can live with the current state of things. I'll create a follow-up for that (or update the current one if it exists).
src/filters/common.js
Outdated
 */

/**
 * @module paste-from-office/filters/common
I'm not sure if this is a common filter - so maybe we should just move it to removeboldtagwrapper.js.
src/filters/common.js
Outdated
 */
export function removeBoldTagWrapper( { documentFragment, writer } ) {
	for ( const childWithWrapper of documentFragment.getChildren() ) {
		if ( childWithWrapper.is( 'b' ) && childWithWrapper.getStyle( 'font-weight' ) === 'normal' ) {
child would be enough - better to read here :)
src/filters/common.js
Outdated
	for ( const childWithWrapper of documentFragment.getChildren() ) {
		if ( childWithWrapper.is( 'b' ) && childWithWrapper.getStyle( 'font-weight' ) === 'normal' ) {
			const childIndex = documentFragment.getChildIndex( childWithWrapper );
			const removedElement = writer.remove( childWithWrapper )[ 0 ];
I'd avoid such constructs if possible:
writer.remove( child );
writer.insertChild( index, child.getChildren(), documentFragment );
will also work.
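To make the intended effect of the filter concrete, here is a small sketch on a stand-in tree. It deliberately does not use the CKEditor 5 view API or the writer calls discussed above: `element()` and `removeBoldWrapperSketch()` are hypothetical helpers, and the unwrapping is done with plain arrays.

```javascript
// Stand-in tree node. This is NOT the CKEditor 5 view API.
function element( name, styles = {}, children = [] ) {
	return { name, styles, children };
}

// Replaces every top-level <b style="font-weight:normal"> with its own children.
// Other elements, including genuinely bold <b> tags, are kept as they are.
function removeBoldWrapperSketch( fragment ) {
	fragment.children = fragment.children.flatMap( child => {
		const isWrapper = child.name === 'b' && child.styles[ 'font-weight' ] === 'normal';

		return isWrapper ? child.children : [ child ];
	} );

	return fragment;
}

const fragment = element( '$fragment', {}, [
	element( 'b', { 'font-weight': 'normal' }, [ element( 'p' ), element( 'p' ) ] ),
	element( 'b', { 'font-weight': 'bold' }, [ element( 'span' ) ] )
] );

removeBoldWrapperSketch( fragment );
```

The children of the wrapper end up at the position the wrapper occupied, which is the same outcome the remove-then-insert pair aims for.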
src/filters/common.js
Outdated
/**
 * Removes `<b>` tag wrapper added by Google Docs to a copied content.
 *
 * @param {module:engine/view/documentfragment~DocumentFragment} documentFragment
Wrong parameters in the docs.
src/normalizer.jsdoc
Outdated
/**
 * Method applies normalization to given data.
 *
 * @method #exec
I might have forgotten about it - it should use the full form: execute()
tests/pastefromoffice.js
Outdated
	{
		'text/html': '<meta name=Generator content="Microsoft Word 15"><p class="MsoNormal">Hello world<o:p></o:p></p>'
	}
].forEach( ( inputData, index ) => {
As above (helper test function vs forEach()).
tests/normalizer/mswordnormalizer.js
Outdated
describe( 'isActive()', () => {
	describe( 'correct data set', () => {
		[
			'<meta name=Generator content="Microsoft Word 15"><p>Foo bar</p>',
It does not detect the other option - only one form of compatible content.
Compare readability of the tests:
describe( 'isActive()', () => {
	it( 'should return true for MS Word content', () => {
		expect( normalizer.isActive( '<meta name=Generator content="Microsoft Word 15"><p>Foo bar</p>' ) ).to.be.true;
	} );
	it( 'should return true for MS Word content - in Safari', () => {
		expect( normalizer.isActive( '<meta name=Generator content="Microsoft Word 15"><p>Foo bar</p>' ) ).to.be.true;
	} );
	it( 'should return false for non-compatible content', () => {
		expect( normalizer.isActive( '<p>Foo bar</p>' ) ).to.be.false;
	} );
	it( 'should return false for content from other source', () => {
		expect( normalizer.isActive( '<p id="docs-internal-guid-12345678-1234-1234-1234-1234567890ab"></p>' ) ).to.be.false;
	} );
} );
Marking every test with a number provides little value to others - I have to check what the # was and why it might fail.
Those tests are simple enough and don't have to be run in a loop.
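For context, the kind of check that isActive() performs can be sketched as below. The patterns are assumptions inferred only from the sample strings used in the tests above; they are not the regular expressions from the PR, and `isMsWordContent()` / `isGoogleDocsContent()` are hypothetical names.

```javascript
// Assumed patterns, inferred only from the sample strings in the tests above.
const msWordPattern = /<meta\s+name="?generator"?\s+content="?microsoft\s+word\s+\d+"?/i;
const googleDocsPattern = /id="docs-internal-guid-[-0-9a-f]+"/i;

// Hypothetical detection helpers, one per content source.
function isMsWordContent( html ) {
	return msWordPattern.test( html );
}

function isGoogleDocsContent( html ) {
	return googleDocsPattern.test( html );
}

const wordSample = '<meta name=Generator content="Microsoft Word 15"><p>Foo bar</p>';
const googleDocsSample = '<p id="docs-internal-guid-12345678-1234-1234-1234-1234567890ab"></p>';
const plainSample = '<p>Foo bar</p>';
```

Each sample maps to one named test case, which is what makes the loop-free version easy to describe.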
It does not detect the other option - only one form of compatible content.
There was no such check in the original data. Now it's added :)
const normalizer = new GoogleDocsNormalizer();

describe( 'isActive()', () => {
	describe( 'correct data set', () => {
Check the notes about running tests in MSWordNormalizer tests.
Co-Authored-By: Maciej <jodator@jodator.net>
I've checked it once again and everything works fine. I didn't find any new unexpected behaviour regarding pasting from Word.
tests/pastefromoffice.js
Outdated
// @param {Boolean} shouldBeProcessed determines if data should be marked as processed with isTransformedWithPasteFromOffice flag
// @param {Boolean} [isAlreadyProcessed=false] apply flag before paste from office plugin will transform the data object
function checkDataProcessing( inputString, shouldBeProcessed, isAlreadyProcessed = false ) {
	// const htmlDataProcessor = new HtmlDataProcessor();
Remove - left over comment.
src/filters/removeboldwrapper.js
Outdated
 */

/**
 * @module paste-from-office/filters/removeboldtagwrapper
Wrong module - should be removeboldwrapper.
Sorry for the two out-of-order comments - I forgot to start a review with them.
Finishing touches and we are good to go :)
tests/pastefromoffice.js
Outdated
clipboard.fire( 'inputTransformation', data );

if ( shouldBeProcessed ) {
Those two methods of testing could be preserved - now the test is a bit too mangled ;) Also, two booleans in the method parameters are too many.
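One possible shape for this, shown only as a sketch, is to replace the single helper taking two booleans with two small helpers whose names say what is being checked. `transformSketch()` is a hypothetical stand-in for the plugin's behaviour, and `wasProcessed()` / `wasSkippedAsAlreadyProcessed()` are made-up names, not code from this PR.

```javascript
// Hypothetical stand-in for the code under test: it cleans Word content once
// and marks the data with the isTransformedWithPasteFromOffice flag.
function transformSketch( data ) {
	const isWord = /microsoft\s+word/i.test( data.content );

	if ( isWord && !data.isTransformedWithPasteFromOffice ) {
		data.content = data.content.replace( /<o:p><\/o:p>/g, '' );
		data.isTransformedWithPasteFromOffice = true;
	}

	return data;
}

// Helper 1: was fresh input picked up and flagged?
function wasProcessed( html ) {
	return transformSketch( { content: html } ).isTransformedWithPasteFromOffice === true;
}

// Helper 2: was input that already carries the flag left untouched?
function wasSkippedAsAlreadyProcessed( html ) {
	const data = transformSketch( { content: html, isTransformedWithPasteFromOffice: true } );

	return data.content === html;
}

const wordHtml = '<meta name=Generator content="Microsoft Word 15"><p class="MsoNormal">Hello world<o:p></o:p></p>';
```

A call such as `wasProcessed( wordHtml )` reads on its own, whereas `checkDataProcessing( wordHtml, true, false )` forces the reader to look up what each boolean means.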
Suggested merge commit message (convention)
Feature: Prevent of bolding entire content pasted from google docs. Closes ckeditor/ckeditor5#2491.
Additional information