Skip to content
This repository has been archived by the owner on Nov 5, 2021. It is now read-only.

Add support for Array Type Checker #105

Merged
merged 8 commits into from
Mar 5, 2021
Merged

Add support for Array Type Checker #105

merged 8 commits into from
Mar 5, 2021

Conversation

shubhamnarlawar
Copy link
Contributor

@shubhamnarlawar shubhamnarlawar commented Feb 17, 2021

Purpose

Checks whether array typecast is possible from source type to destination type.

Related discussion - ballerina-platform/nballerina#50

Fixes #86

@shubhamnarlawar
Copy link
Contributor Author

Please review and suggest changes - @ruvi-d @manuranga

@shubhamnarlawar
Copy link
Contributor Author

This PR addresses issue #86.

@shubhamnarlawar
Copy link
Contributor Author

From issue - #86

Write wrapper function for is_same_type() to check whether original types of source and destination can be type casted.

The original types are available in smart pointers. The original types - for example - int, char, any, etc is supposed to be checked before callling is_same_type().

As discussed earlier, wrapper function is supposed to be implemented in rust. A represents array in is_same_type(). Probably I should take a for any data type in wrapper function.

Is wrapper function suggested to be implemented in rust or llvm side?

@ruvi-d
Copy link
Contributor

ruvi-d commented Feb 18, 2021

From issue - #86

Write wrapper function for is_same_type() to check whether original types of source and destination can be type casted.

The original types are available in smart pointers. The original types - for example - int, char, any, etc is supposed to be checked before callling is_same_type().

As discussed earlier, wrapper function is supposed to be implemented in rust. A represents array in is_same_type(). Probably I should take a for any data type in wrapper function.

Is wrapper function suggested to be implemented in rust or llvm side?

IMO, such decisions should be based on whether the check is at compile time or runtime. If the check needs to happen at compile time, the implementation should be on the C++/LLVM side. If it's a runtime check, then it should happen on the Rust side.

Copy link
Contributor

@ruvi-d ruvi-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see review comments

Comment on lines +64 to +65
let source_cstr: &CStr = unsafe { CStr::from_ptr(src_type) };
let dest_cstr: &CStr = unsafe { CStr::from_ptr(dest_type) };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an assert!(!src_type.is_null()); before CStr::from_ptr(src_type). Do for dest_type as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 46 to 49
Some('C') => priority = 4,
Some('B') => priority = 1,
Some('F') | Some('I') => priority = 8,
Some('A') => priority = 16,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment defining the char to type mapping?
#50 suggested X for any type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Please check it!

return false;
}
}
_ => println!("Unsupported data structure"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsupported should fail check. return false;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Done.

Some('B') => priority = 1,
Some('F') | Some('I') => priority = 8,
Some('A') => priority = 16,
_ => println!("Unsupported type"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return error Result for unsupported types

Copy link
Contributor Author

@shubhamnarlawar shubhamnarlawar Feb 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this?

_ => panic!("Unsupported type in mangled string"),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd return a std::result::Result from the function. The lib.rs side can check the return and panic if required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
let mut src_priority = 0;
let mut dest_priority = 0;
match source.chars().nth(0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Define these magic numbers for bit positions as constants.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit more in this suggestion?

Do you mean to use below snippet instead -

let constant = source.chars().nth(0);
match constant {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no.. I mean define the string indexes as constants like:

const BASE_TYPE_INDX: u32 = 0;
const ARRAY_MEMBER_TYPE_INDX: u32 = 1;
const ARRAY_SIZE_LSB_INDX: u32 = 2;
const ARRAY_SIZE_MSB_INDX: u32 = 3;
...
// Then use these constants when accessing the chars
match source.chars().nth(BASE_TYPE_INDX)

Hope that makes sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Makes sense. I have used usize instead of u32 as match expects a usize-

pub const BASE_TYPE_INDX: usize = 0;
pub const ARRAY_MEMBER_TYPE_INDX: usize = 1;
pub const ARRAY_SIZE_LSB_INDX: usize = 2;
pub const ARRAY_SIZE_MSB_INDX: usize = 3;

I will push this snippet along with other changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took some time, but I was able to come up with following code. Please do the research on rust language best practices in the future.

extern crate num;
extern crate num_derive;

use num::FromPrimitive;
use num_derive::FromPrimitive;

const ARRAY_MEMBER_TYPE_INDEX: usize = 2;

#[derive(Debug, PartialEq, FromPrimitive)]
#[repr(u32)]
enum BalType {
    Nil = 'N' as u32,
    Boolean = 'B' as u32,
    Int = 'I' as u32,
    Float = 'F' as u32,
    Decimal = 'D' as u32,
    String = 'S' as u32,
    Array = 'A' as u32,
    Any = 'X' as u32,
}

pub fn type_size(type_string: &str) -> i32 {
    let type_char = type_string.chars().nth(ARRAY_MEMBER_TYPE_INDEX)
                                      .unwrap_or_else(|| panic!("illegle type discripor '{}', wrong length", type_string));
    let bal_type = FromPrimitive::from_u32(type_char as u32)
                                      .unwrap_or_else(|| panic!("illegle type tag '{}' in type discripor '{}'", type_char, type_string));
    match bal_type {
        BalType::Boolean => { 1 }
        BalType::Int | BalType::Float => { 8 }
        BalType::String => { 12 }
        BalType::Any => { 16 }
        _ => { unimplemented!("type_size for '{:?}'", bal_type) }
    }
}

Cargo.toml

[dependencies]
num = "0.3"
num-traits = "0.2"
num-derive = "0.3"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the snippet! Sure. I will follow it in next tasks!

Let me modify other functions to follow above points!

//Index 2 and 3 represent the size
bits.push(type_string.chars().nth(2).unwrap());
bits.push(type_string.chars().nth(3).unwrap());
let total_bits: i32 = bits.parse().unwrap();
Copy link
Contributor

@ruvi-d ruvi-d Feb 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to handle unbounded Array sizes?

@ruvi-d
Copy link
Contributor

ruvi-d commented Feb 18, 2021

@shubhamnarlawar other general review comments:

  1. Shall we move the logic out of lib.rs and into separate modules? Let's try to keep lib.rs as thin as possible

  2. Add some Rust unit tests for the new functions

  3. Do we need to handle unbounded size arrays? I believe that on the BIR side unbounded is handled with a -1 size. How are we handling that?

@shubhamnarlawar
Copy link
Contributor Author

From issue - #86
Write wrapper function for is_same_type() to check whether original types of source and destination can be type casted.
The original types are available in smart pointers. The original types - for example - int, char, any, etc is supposed to be checked before callling is_same_type().
As discussed earlier, wrapper function is supposed to be implemented in rust. A represents array in is_same_type(). Probably I should take a for any data type in wrapper function.
Is wrapper function suggested to be implemented in rust or llvm side?

IMO, such decisions should be based on whether the check is at compile time or runtime. If the check needs to happen at compile time, the implementation should be on the C++/LLVM side. If it's a runtime check, then it should happen on the Rust side.

Got it. Thanks for the clarification.

@shubhamnarlawar
Copy link
Contributor Author

@shubhamnarlawar other general review comments:

1. Shall we move the logic out of lib.rs and into separate modules? Let's try to keep lib.rs as thin as possible

2. Add some Rust unit tests for the new functions

3. Do we need to handle unbounded size arrays? I believe that on the BIR side unbounded is handled with a `-1` size. How are we handling that?
  1. Yes. How about a new file typeChecker.rs?
  2. Sure. I will add unit tests at last in this PR itself after pushing above review changes.
  3. Let me check and get back to you on unbounded size arrays.

@ruvi-d
Copy link
Contributor

ruvi-d commented Feb 18, 2021

From issue - #86
Write wrapper function for is_same_type() to check whether original types of source and destination can be type casted.
The original types are available in smart pointers. The original types - for example - int, char, any, etc is supposed to be checked before callling is_same_type().
As discussed earlier, wrapper function is supposed to be implemented in rust. A represents array in is_same_type(). Probably I should take a for any data type in wrapper function.
Is wrapper function suggested to be implemented in rust or llvm side?

IMO, such decisions should be based on whether the check is at compile time or runtime. If the check needs to happen at compile time, the implementation should be on the C++/LLVM side. If it's a runtime check, then it should happen on the Rust side.

Got it. Thanks for the clarification.

Let us know if there is a good reason to use Rust logic for compile time checks and then maybe we can take it case by case.

@ruvi-d
Copy link
Contributor

ruvi-d commented Feb 18, 2021

  1. Yes. How about a new file typeChecker.rs?

I think it should be type_checker.rs inline with Rust naming conversions.

@shubhamnarlawar
Copy link
Contributor Author

From issue - #86
Write wrapper function for is_same_type() to check whether original types of source and destination can be type casted.
The original types are available in smart pointers. The original types - for example - int, char, any, etc is supposed to be checked before callling is_same_type().
As discussed earlier, wrapper function is supposed to be implemented in rust. A represents array in is_same_type(). Probably I should take a for any data type in wrapper function.
Is wrapper function suggested to be implemented in rust or llvm side?

IMO, such decisions should be based on whether the check is at compile time or runtime. If the check needs to happen at compile time, the implementation should be on the C++/LLVM side. If it's a runtime check, then it should happen on the Rust side.

Got it. Thanks for the clarification.

Let us know if there is a good reason to use Rust logic for compile time checks and then maybe we can take it case by case.

Sure.

At this point, there is no need to write wrapper function for calling 'is_same_type()'.

@shubhamnarlawar
Copy link
Contributor Author

@shubhamnarlawar other general review comments:

1. Shall we move the logic out of lib.rs and into separate modules? Let's try to keep lib.rs as thin as possible

2. Add some Rust unit tests for the new functions

3. Do we need to handle unbounded size arrays? I believe that on the BIR side unbounded is handled with a `-1` size. How are we handling that?

For point 3, we have not planned handling unbounded size arrays in this sprint. We need to discuss it in Monday's meeting -

Some quick thoughts to discuss -
What happens when one typecast such an array to a type that has a fixed length? What happens if the fixed length array is larger than the current length of the variable length array? What happens if it is smaller?

pub fn calculate_type_priority(type_string: &str) -> Result<i32, &'static str> {
let priority: i32;
match type_string.chars().nth(ARRAY_MEMBER_TYPE_INDX) {
Some('C') | Some('B') => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be descriptive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added comment at top of this function-

/*
* Computes size of the type
* Type Notations -
* C represents character
* B represents boolean
* F represents float
* I represents integer
* S represents string
* X represents any*/

Is using a macro suggested here ? Like -

macro_rules! Char { () => {'C'}; }

and In the match case, it will be - Some(Char!())

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manuranga - In yesterday's call, you suggested to use enum. But here at some places, two entities require same value -

for example - Some('C') | Some('B') - here both characer and boolean requires a value of 1.

Can you share example to make it more descriptive? You mentioned that you have known in other languages in C++ to make this match case more descriptive. Can you please share here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi looks like rust doesn't support this out of the box yet rust-lang/rfcs#3040
But it seems like there is a common way to do this : https://stackoverflow.com/a/28029279
Please try to use this approach

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From above link - I can write enum as below -

enum TYPE_SIZE {
CHAR = 1,
INT = 8,
STRING = 12,
ANY = 16,
}
and so on

but for boolean data type with size 1 byte, it also need 1
similarly, float needs 8. Here we have repetitive values. How would enum works in this case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I am suggesting not to have unnamed constants like C and X. This is the same reason we use constants instead inlined numbers https://stackoverflow.com/a/47902.

I was not suggesting to have priorities be encoded into to enum.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I misinterpreted it.

How about using below constants instead of just C and X? -

pub const CHARACTER: char = 'C';
pub const BOOLEAN: char = 'B';
pub const INTEGER: char = 'I';
pub const FLOAT: char = 'F';
pub const STRING: char = 'S';
pub const ANY: char = 'X';
pub const ARRAY: char = 'A';

Some('A') => {
if source.chars().nth(ARRAY_MEMBER_TYPE_INDX)
!= destination.chars().nth(ARRAY_MEMBER_TYPE_INDX)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this line formatted using rustfmt ? brace seems to be on the wrong line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am using cargo-fmt. I tried with rustfmt but getting same result. Let me reformat it once again!

return false;
}
let mut src_type_size: Result<i32, &'static str> = Ok(0);
let mut dest_type_size: Result<i32, &'static str> = Ok(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Rust, it's a best practice to use mut minimally. If you can avoid it, it's better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! Sure

* F represents float
* I represents integer
* S represents string
* X represents any*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shubhamnarlawar Please read https://ballerina.io/spec/lang/draft/v2020-12-17/#section_5.1.4
these are the type we need to support. We can start by selected subset, but there is no type called character.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. From the above enum you wrote, we can treat it as a subset!

@ruvi-d
Copy link
Contributor

ruvi-d commented Mar 1, 2021

@shubhamnarlawar remember to create a separate type_checker.rs file similar to how we created a new file for the map implementation.

@shubhamnarlawar
Copy link
Contributor Author

@shubhamnarlawar remember to create a separate type_checker.rs file similar to how we created a new file for the map implementation.

I have moved all functions except is_same_type() and tests to type_checker.rs. Please check it!

@shubhamnarlawar
Copy link
Contributor Author

Is this PR ready to merge now?

@ruvi-d
Copy link
Contributor

ruvi-d commented Mar 3, 2021

Is this PR ready to merge now?

You will have to fix the conflicts with master. When the PR is ready to review, you can mark my request for change complete and request for review again.

@shubhamnarlawar shubhamnarlawar requested a review from ruvi-d March 3, 2021 14:34
@shubhamnarlawar
Copy link
Contributor Author

Is this PR ready to merge now?

You will have to fix the conflicts with master. When the PR is ready to review, you can mark my request for change complete and request for review again.

I have resolved the conflict with master. No merge conflict now!
I have re-requested you for a review.

@shubhamnarlawar
Copy link
Contributor Author

@ruvi-d - Please give suggestions if any on latest changes pushed.

Copy link
Contributor

@ruvi-d ruvi-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better now with the Rust unit tests.
Just one more suggestion. Move the is_same_type function to the type_checker module. Change it's input args to take Rust strings. On the lib.rs, only do the String construction and directly pass it to type_checker module. Similar to how the Map implementation has been done.
The idea here is to keep lib.rs as thin as possible, doing only the data type conversions. The actual logic will be kept on a separate, reusable(independent) Rust module.

@shubhamnarlawar
Copy link
Contributor Author

Much better now with the Rust unit tests.
Just one more suggestion. Move the is_same_type function to the type_checker module. Change it's input args to take Rust strings. On the lib.rs, only do the String construction and directly pass it to type_checker module. Similar to how the Map implementation has been done.
The idea here is to keep lib.rs as thin as possible, doing only the data type conversions. The actual logic will be kept on a separate, reusable(independent) Rust module.

Done. Please check it.

I have moved logic of is_same_type() into a separate function called same_type() in type_checker.rs Now, the code is much cleaner.

Copy link
Contributor

@ruvi-d ruvi-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am OK to merge. Best if @manuranga can double check the Rust lang specific stuff.

@manuranga
Copy link
Contributor

Still little bit unsure about the algo, but we'll catch if there is any issues with lit tests later.

Looks good to merge.

@shubhamnarlawar
Copy link
Contributor Author

Sounds good. Can you mark it as review complete and approve?

@Kishanthan Kishanthan merged commit 9fbebb6 into master Mar 5, 2021
@shubhamnarlawar shubhamnarlawar deleted the is_same_type branch April 20, 2021 14:28
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement is_same_type for array type
4 participants