I have 2 frames that contain network packet fields (source IP, destination IP and so on): one frame captured at the source, one at the destination. In my Python software (which I am now rewriting in Haskell), I matched the 2 dataframes via a hash of the packet.
In Haskell this gives:
-- generate a column with a hash of other columns
addHash :: FrameFiltered Packet -> Frame (Record '[PacketHash])
addHash aframe =
  fmap addHash' frame
  where
    frame = fmap toHashablePacket (ffFrame aframe)
    addHash' row = Col (hashWithSalt 0 row) :& RNil

-- here aframe1 and aframe2 have the same type
mergeTcpConnectionsFromKnownStreams aframe1 aframe2 =
  mergedFrame
  where
    mergedFrame = innerJoin @'[PacketHash] hframe1 hframe2
    -- each capture is zipped with the hashes computed from its own rows
    hframe1 = zipFrames (addHash aframe1) (ffFrame aframe1)
    hframe2 = zipFrames (addHash aframe2) (ffFrame aframe2)
It compiles and seems to run, but after the innerJoin there should be several columns with the same name. Doesn't that break the API somewhat? For instance, how can I select one of the 2 sourceIP columns present in the merged dataframe?
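To make the question concrete, here is a minimal sketch in plain Haskell (base only, not the Frames API) of a join result in which both source IPs stay addressable. The `Packet` and `Joined` types, their field names, and `innerJoinOnHash` are all hypothetical and exist only for illustration; the idea is that each joined row keeps the complete row from each capture under a distinct name instead of flattening everything into one record with clashing column names.

```haskell
-- Hypothetical, simplified packet row; the field names are illustrative only.
data Packet = Packet
  { packetHash :: Int
  , sourceIp   :: String
  , destIp     :: String
  } deriving (Show, Eq)

-- One joined row: the shared key plus the complete row from each capture.
data Joined = Joined
  { joinKey     :: Int
  , senderRow   :: Packet
  , receiverRow :: Packet
  } deriving (Show, Eq)

-- Inner join on packetHash. Every matching pair is emitted, so duplicated
-- keys yield the cross product of their candidate rows.
innerJoinOnHash :: [Packet] -> [Packet] -> [Joined]
innerJoinOnHash senders receivers =
  [ Joined (packetHash s) s r
  | s <- senders
  , r <- receivers
  , packetHash s == packetHash r
  ]
```

With this shape, `sourceIp (senderRow j)` and `sourceIp (receiverRow j)` are unambiguous for a joined row `j`. Whether Frames can offer something equivalent (for example by renaming or prefixing the columns of one side before the join) is exactly what I am asking.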
I then serialized the result via writeCsv. Because of #155 the output is messed up, so I can't interpret it yet, but I do see columns with the same names.
I also noticed a bug on my side: I was doing a join on the PacketHash column but all my hashes were equal to 0 (now fixed).
Even with every hash identical, all packets got paired 1 to 1. That behavior is strange, if not wrong, so maybe we could add a function that performs some check, or specify the behavior when several rows are candidates for a merge?
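As a rough idea of the kind of check I mean, here is a sketch in plain Haskell (base only). It works on the list of join keys extracted from each side, not on Frames types, and the names `duplicateKeys` and `checkUniqueKeys` are hypothetical, not existing Frames functions. It refuses the join when a key appears more than once on either side, which would have caught my all-zero hashes immediately.

```haskell
import Data.List (group, sort)

-- Return every key that occurs more than once, each reported a single time.
duplicateKeys :: Ord k => [k] -> [k]
duplicateKeys keys = [ k | (k : _ : _) <- group (sort keys) ]

-- Succeed only when both sides have unique keys; otherwise describe how
-- many distinct keys are duplicated on each side.
checkUniqueKeys :: Ord k => [k] -> [k] -> Either String ()
checkUniqueKeys leftKeys rightKeys =
  case (duplicateKeys leftKeys, duplicateKeys rightKeys) of
    ([], []) -> Right ()
    (ls, rs) ->
      Left ( "ambiguous join: "
          ++ show (length ls) ++ " duplicated key(s) on the left, "
          ++ show (length rs) ++ " on the right" )
```

Sorting makes this O(n log n) per side. An alternative design would be to allow duplicates but document that the join returns every matching pair, as a SQL inner join does.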