Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hex2bytes for Vector{UInt8} #23267

Merged
merged 18 commits into from
Aug 22, 2017
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
49 changes: 49 additions & 0 deletions base/strings/util.jl
Original file line number Diff line number Diff line change
Expand Up @@ -463,6 +463,55 @@ function hex2bytes(s::AbstractString)
return a
end

"""
hex2bytes(d::Vector{UInt8}, s::Vector{UInt8}, nInBytes::Int=length(s))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nInBytes is a bit of weird capitalization. Perhaps use ninbytes or perhaps even better just n,


Convert the first `nInBytes` hexadecimal bytes array to its binary representation. The
results are populated into a destination array. The function returns the number of bytes
copied into the destination array. The size of destination array must be at least half of
the `nInBytes` parameter.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use and example here:


# Examples
```jldoctest
julia> hex2bytes(...)
end
```

Copy link
Member

@stevengj stevengj Aug 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hex2bytes now has an example, moved from hex2bytes!.

Typically, when we have both an in-place and out-of-place version of a function, we only give examples in the simpler out-of-place version.

function hex2bytes(d::Vector{UInt8}, s::Vector{UInt8}, nInBytes::Int=length(s))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this modifies its argument, it should be called hex2bytes!. It would be good to also add a 1-argument version that allocates the output for you, hex2bytes(s::Vector{UInt8}).

if isodd(nInBytes)
throw(ArgumentError("Input data length should be even"))
end

len2 = div(nInBytes, 2)
if size(d)[1] < len2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use length(d) here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this code uses @inbounds but doesn't check that nInBytes <= length(s).

throw(ArgumentError("Destination data buffer should be sufficiently large"))
end

i = 0
j = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will return 1 if the input is empty.

# This line is important as this ensures computation happens in word boundary and not
# byte boundary. Boundary computation can be almost 10 times slower
n = c1 = c2 = UInt(0)
for j = 1:len2
n = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why initialize n to zero?

@inbounds c1 = UInt(s[i+=1])
@inbounds c2 = UInt(s[i+=1])
n = get_number_from_hex(c1)
n <<= 4
n += get_number_from_hex(c2)
@inbounds d[j] = (n & 0xFF)
end
return j
end

@inline get_number_from_hex(c::UInt) = begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@inline function number_from_hex(c::UInt)
    # ...
end

const DIGIT_ZERO = UInt('0')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove all const.

const DIGIT_NINE = UInt('9')
const LATIN_UPPER_A = UInt('A')
const LATIN_UPPER_F = UInt('F')
const LATIN_A = UInt('a')
const LATIN_F = UInt('f')

return (DIGIT_ZERO <= c <= DIGIT_NINE) ? c - DIGIT_ZERO :
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 nested tertiaries is a bit hard to reason about. Perhaps something like

DIGIT_ZERO    <= c <= DIGIT_NINE    && return c - DIGIT_ZERO
LATIN_UPPER_A <= c <= LATIN_UPPER_F && return c - LATIN_UPPER_A + 10
LATIN_A       <= c <= LATIN_F       && return c - LATIN_A + 10
throw(ArgumentError("not a hexadecimal number: '$(Char(c))'"))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a matter of personal preference and choice. A few lines above in hex2bytes(AbstractVector) has 3 nested tertiaries. Hence ignoring this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a matter of personal preference and choice. A few lines above in hex2bytes(AbstractVector) has 3 nested tertiaries. Hence ignoring this comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we can fix it up at some other point, doesn't need to block this PR.

(LATIN_UPPER_A <= c <= LATIN_UPPER_F) ? c - LATIN_UPPER_A + 10 :
(LATIN_A <= c <= LATIN_F) ? c - LATIN_A + 10 :
throw(ArgumentError("Not a hexadecimal number"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps "$c is not a hexadecimal number"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

end

"""
bytes2hex(bin_arr::Array{UInt8, 1}) -> String

Expand Down