-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1409 importer handling of imvalid files #1437
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One suggestion to avoid side effects
@@ -113,10 +119,6 @@ public void CalculateFormat(IDataSourceFile dataSource, IReadOnlyList<ColumnInfo | |||
throw new UnsupportedFormatException(dataSource.Path); | |||
|
|||
dataSource.AvailableFormats = AvailableFormats(dataSource.DataSheets[sheetName].RawData, columnInfos, metaDataCategories).ToList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this code, we are modifying the dataSource given as parameter. Can I suggest to change this methods so that it does not have side effect and returns the availableFormats instead? Line 108 for instance . It is not clear at all that the format will be set in this method. Thougts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abdelr not sure if you saw this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abdelr not sure if you saw this
Sure, but this is still a work in progress since the last discussion on markers for the sheets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a big PR with some interesting changes. I need some more time to review
src/OSPSuite.Assets/UIConstants.cs
Outdated
@@ -1275,6 +1277,9 @@ public static class Error | |||
public static readonly string NaNOnData = "Data contains NaN values at imported columns. Select a different action for NaN values or clean your data."; | |||
public static readonly string UnsupportedFileType = "The type of file that you are trying to open is not currently supported"; | |||
public static readonly string CannotRemoveBaseGridColumnStillInUse = "Cannot remove base grid column still used by other columns"; | |||
public static readonly string SimpleParseErrorMessage = "There were errors while parsing your data. Navigate to the sheets to read the concrete error."; | |||
|
|||
public static string ParseErrorMessage(IEnumerable<string> errors) => "There were errors while parsing your data: " + string.Join(". ", errors); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here you could use string interpolation and not the old school +
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change in ospsuite.Dimensions is suspicious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see wha't's happening and this is ok. We are breaking a lot of encapsulation but passing Cache back and forth .
Cache is great because you can define a way to retrieve keys automatically by specifing a lambda. We are never using it. WE also never know what the key is. Is it the sheetName sometimes? what's a DataSet compared to a Sheet. Why are we adding DataSet with the sheet key as key..
Lots of questions for me...also coming from older code. We need to run resharper on those files because some basic optimization and code clean up are not done
@@ -16,15 +19,15 @@ public interface IDataSource | |||
{ | |||
void SetDataFormat(IDataFormat dataFormat); | |||
void SetNamingConvention(string namingConvention); | |||
void AddSheets(Cache<string, DataSheet> dataSheets, IReadOnlyList<ColumnInfo> columnInfos, string filter); | |||
Cache<IDataSet, List<ParseErrorDescription>> AddSheets(Cache<string, DataSheet> dataSheets, IReadOnlyList<ColumnInfo> columnInfos, string filter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an unexpected signature. What does this returned cache represent? .Can you comment the interface so that we know what's going on?
In general, returning a cache is not commun weird because you are exposing the API to the consumer but it might be required
void SetMappings(string fileName, IEnumerable<MetaDataMappingConverter> mappings); | ||
ImporterConfiguration GetImporterConfiguration(); | ||
IEnumerable<MetaDataMappingConverter> GetMappings(); | ||
Cache<string, IDataSet> DataSets { get; } | ||
IEnumerable<string> NamesFromConvention(); | ||
NanSettings NanSettings { get; set; } | ||
ImportedDataSet DataSetAt(int index); | ||
void ValidateDataSourceUnits(IReadOnlyList<ColumnInfo> columnInfos); | ||
Cache<IDataSet, List<ParseErrorDescription>> ValidateDataSourceUnits(IReadOnlyList<ColumnInfo> columnInfos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same with this one.
} | ||
} | ||
|
||
public class MismatchingArrayLengthsParseErrorDescription : ParseErrorDescription |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have all those classes instead of using the string? associated with the message?
{ | ||
public EmptyDataSetsParseErrorDescription(IEnumerable<string> dataSetNames) | ||
{ | ||
Message = Error.EmptyDataSet($"{string.Join(", ", dataSetNames)}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use dataSetNames.ToString(", ")
} | ||
|
||
public static void Add<T1, T2>(Cache<T1, List<T2>> cache, T1 key, IEnumerable<T2> list) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make in an extension method (Call this CachedListExtensions) and the first one should call the other instead of duplicating.
{ | ||
if (_data.Any(dataSet => dataSet.Data.Any(pair => pair.Key.ColumnInfo.IsMandatory && pair.Value.Any(point => point.Measurement == indicator || double.IsNaN(point.Measurement))))) | ||
throw new NanException(); | ||
return _data.Any(dataSet => dataSet.Data.Any(pair => pair.Key.ColumnInfo.IsMandatory && pair.Value.Any(point => point.Measurement == indicator || double.IsNaN(point.Measurement)))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we comment a bit on this line ...please. And maybe we can break the logic a bit here because it's very complicated to decipher
Also run resharper on this file
foreach(var x in dataSource.DataSheets.Keys) | ||
{ | ||
dataSource.AvailableFormats = CalculateFormat(dataSource, columnInfos, metaDataCategories, x).ToList(); | ||
if (dataSource.AvailableFormats?.Count > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can it be nulll if you call ToList before?
throw new UnsupportedFormatException(dataSource.Path); | ||
|
||
dataSource.Format = dataSource.AvailableFormats.FirstOrDefault(); | ||
return AvailableFormats(dataSource.DataSheets[sheetName].RawData, columnInfos, metaDataCategories); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could return a IReadOnly so that you don't have to TOList earlier.
@@ -16,15 +19,15 @@ public interface IDataSource | |||
{ | |||
void SetDataFormat(IDataFormat dataFormat); | |||
void SetNamingConvention(string namingConvention); | |||
void AddSheets(Cache<string, DataSheet> dataSheets, IReadOnlyList<ColumnInfo> columnInfos, string filter); | |||
Cache<IDataSet, List<ParseErrorDescription>> AddSheets(Cache<string, DataSheet> dataSheets, IReadOnlyList<ColumnInfo> columnInfos, string filter); | |||
void SetMappings(string fileName, IEnumerable<MetaDataMappingConverter> mappings); | |||
ImporterConfiguration GetImporterConfiguration(); | |||
IEnumerable<MetaDataMappingConverter> GetMappings(); | |||
Cache<string, IDataSet> DataSets { get; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this as a cache exposed. DataSet are added outside of the DataSource as it stands
_importer.AddFromFile(_configuration.Format, filterSheets(dataSheets, filter), columnInfos, this); | ||
if (NanSettings == null || !double.TryParse(NanSettings.Indicator, out var indicator)) | ||
indicator = double.NaN; | ||
var errors = new Cache<IDataSet, List<ParseErrorDescription>>(); | ||
foreach (var dataSet in DataSets.KeyValues) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you iterating over the keyValues? You are never using the Key
@msevestre I encapsulated the errors on a new class. I hope it makes more sense now? |
} | ||
|
||
void IDataSource.ValidateDataSourceUnits(IReadOnlyList<ColumnInfo> columnInfos) | ||
ParseErrors IDataSource.ValidateDataSourceUnits(IReadOnlyList<ColumnInfo> columnInfos) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the IDataSource. here?
|
||
namespace OSPSuite.Infrastructure.Import.Core.Helpers | ||
{ | ||
static public class CachedListHelpers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be removed right?
if (_data.Any(dataSet => dataSet.Data.Any(pair => pair.Key.ColumnInfo.IsMandatory && pair.Value.Any(point => point.Measurement == indicator || double.IsNaN(point.Measurement))))) | ||
throw new NanException(); | ||
//returns true if for any ParsedDataSet | ||
return _data.Any( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YES!!!!
@@ -0,0 +1,3 @@ | |||
DevExpress.XtraEditors.CheckEdit, DevExpress.XtraEditors.v21.2, Version=21.2.3.0, Culture=neutral, PublicKeyToken=b88d1754d700e49a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this file
@@ -222,13 +228,12 @@ public string GetActiveFilterCriteria() | |||
|
|||
public void AddTabs(List<string> sheetNames) | |||
{ | |||
//we should seek an alternative | |||
importerTabControl.Images = imageCollection1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename this control
if (_lastLoadedDataSets.Contains(SelectedTab) && _lastErrors.Contains(_lastLoadedDataSets[SelectedTab])) | ||
{ | ||
labelControlError.Text = Error.ParseErrorMessage(_lastErrors.ErrorsFor(_lastLoadedDataSets[SelectedTab]).Select(x => x.Message)); | ||
layoutControlItem2.Visibility = DevExpress.XtraLayout.Utils.LayoutVisibility.Always; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename layoutControlItem2
Also use resharper to simplify some of the name space stuff
<value>163, 17</value> | ||
</metadata> | ||
<assembly alias="DevExpress.Utils.v21.2" name="DevExpress.Utils.v21.2, Version=21.2.3.0, Culture=neutral, PublicKeyToken=b88d1754d700e49a" /> | ||
<data name="imageCollection1.ImageStream" type="DevExpress.Utils.ImageCollectionStreamer, DevExpress.Utils.v21.2" mimetype="application/x-microsoft.net.object.bytearray.base64"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not use images like that at all. Icons are added to the ImageList and are loaded all dynamically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The domain codes is SO Much better. Thanks
The UI code needs to be updated :
Some controls have default naming with 1 at the end
Not proper usage of icons
private void refreshErrorMessage() | ||
{ | ||
|
||
if (_lastLoadedDataSets.Contains(SelectedTab) && _lastErrors.Contains(_lastLoadedDataSets[SelectedTab])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
somehow this feels like this code belongs in the presenter.
{ | ||
importerTabControl.TabPages.Each(x => | ||
{ | ||
if (_lastLoadedDataSets.Contains(x.Text)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this codes for which image to show also belongs in the presenter
And hopefully, we won't have those magin number anymore 1 and 0 once we use ApplicationIcons
<value>17, 17</value> | ||
</metadata> | ||
<data name="useForImportCheckEdit.ToolTip" xml:space="preserve"> | ||
<value>When selected, the filter will apply to the data during the import process. When deselected, the filter only affects this view. Check documentation for more information on defining filters: <href=https://docs.open-systems-pharmacology.org/shared-tools-and-example-workflows/features-of-tables#filtering>https://docs.open-systems-pharmacology.org/shared-tools-and-example-workflows/features-of-tables#filtering</href></value> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tootip needs to be put into constants!
var accumulatedIndexes = 0; | ||
while (sheet.MoveNext() && index >= 0) | ||
{ | ||
if (sheet.Current.Data.Count() > index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(sheet.Current.Data.Count() 3 times in the same while loop. Maybe cache it?
@msevestre Only open PR. Just adding you to the comment to make sure you know this is ready for revision. |
@@ -117,7 +155,27 @@ public IEnumerable<string> NamesFromConvention() | |||
return _importer.NamesFromConvention(_configuration.NamingConventions, _configuration.FileName, DataSets, _mappings); | |||
} | |||
|
|||
public ImportedDataSet DataSetAt(int index) | |||
public IDataSet DataSetAt(int index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's talk about this in the meeting. I don't understand what this is doing.
I think this is working as expected. I have my concerns regarding the deeply nested structure in DataSource.cs |
Fixes #1409