Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: allow passing hash to Instances#add_instance #16

Merged
merged 12 commits into from
Dec 25, 2016
Merged

Feature: allow passing hash to Instances#add_instance #16

merged 12 commits into from
Dec 25, 2016

Conversation

kcning
Copy link
Contributor

@kcning kcning commented Dec 23, 2016

At the moment adding instance can only be done by providing an Instance or an array of attribute values sorted according to the attribute order. It would be nice to be able to pass a hash of attribute values so we don't have to care about the order.

@kcning kcning closed this Dec 23, 2016
@kcning kcning reopened this Dec 23, 2016
Copy link
Owner

@paulgoetze paulgoetze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is a nice feature. I appreciate your contributions. For upcomig changes and feature proposals please make sure you open an issue for the proposal first, in order to open up the discussion at first. It also prevents you from spending time on sth. that might be rejected or need to be changed later on.

names = attribute_names(true).map!(&:to_sym)
names.inject([]) do |values, attribute_name|
values << attribute_values[attribute_name]
end
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the names.inject block you can use Hash#fetch_values:

attribute_values.fetch_values(*names)

Also, should attribute_values be named hash? For me it sounds more like an array just from its name...

names = attribute_names(true).map!(&:to_sym)
names.each_with_object({}).with_index do |(attr_name, hash), index|
hash[attr_name] = attribute_values[index]
end
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is values instead of attribute_values and name for attr_name enough? Just trying to make it as concise as possible.

Maybe using inject is slightly more readable. What about sth. like this?:

names.each_with_index.inject({}) do |hash, (name, index)|
  hash.update(name => values[index])
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is prettier!

# @return [Array] an array containing attribute values in the
# correct order
def attribute_values_from_hash(attribute_values)
names = attribute_names(true).map!(&:to_sym)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using map! with a bang does not make to much sense, if we assign it afterwards anyway, right?
(Same in attribute_values_to_hash())

@@ -49,12 +49,16 @@ def instances
enumerate_instances.to_a
end

def attributes
enumerate_attributes.to_a
def attributes(include_class_attribute = false)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather make it a keyword argument (def attributes(include_class_attribute: false)) in order to force clarity for the user.

I don't really like to provide attributes(true), because a user doesn't know what the true is going to do. With a keyword argument, you will have to name it explicitly: attributes(include_class_attribute: true).
A bit longer, I must admit, but way more understandable.

The same applies for attribute_names() below.

if include_class_attribute && class_attribute_defined?
attrs.insert(class_index, class_attribute)
end
attrs
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add some spaces between the if-block and the single lines (also applies for map/inject blocks). It eases reading the code a lot. :)

use_hash = values.is_a?(Hash)
values = attribute_values_from_hash(values) if use_hash

values = values.each_with_index.map do |value, index|
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

values.map.with_index should work, right?

end

it 'does not return the class attribute' do
expect(subject.attribute_names).to eq(names.reject { |n| n == 'windy' })
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using include might be more elegant:

expect(subject.attribute_names).not_to include 'windy'
# or even this, to keep it even more independent:
expect(subject.attribute_names).not_to include subject.class_attribute.name


let(:humidity_value) { 80 }
let(:windy_value) { 'TRUE' }
let(:play_value) { :yes }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of weird to see the actual values be defined after their usage. Should the lets be defined before let(:data)...?

context 'when each instance is stored as a hash of attribute values' do
let(:hash_data) do
data.map do |attribute_values|
subject.send(:attribute_values_to_hash, attribute_values)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, whether it is a good idea to depend on a private method here. If for some reason it is renamed/removed, this test will break, even if the result is totally fine. Maybe it is better to just redefine the hashes explicitely?

humidity: 85.0,
windy: 1,
play: 1
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figure values & internal_values should be let-blocks. Doesn't make a difference for the static values here, but just for consistency.

@paulgoetze paulgoetze changed the title Feature/add instance Feature: allow passing hash to Instances#add_instance Dec 23, 2016
@paulgoetze
Copy link
Owner

paulgoetze commented Dec 23, 2016

@kcning Another small thing: could you stash your commits or make sure they describe the changes properly?
Commit messages like tidy up or bug fix are not too descriptive if you want to know what has changed (also consider rebasing if you pull changes from the upstream branch) Thank you! :)

@kcning
Copy link
Contributor Author

kcning commented Dec 24, 2016

Thanks for the quick review! I fixed the problems. Please let me know if more should be changed.

@paulgoetze
Copy link
Owner

@kcning Hm, looks like somehow three of my earlier commits sneaked into your changes (now with other hashes).

Removing them and rebasing on top of your develop should clean this up:

# for each of the three commits:
git rebase --onto commit-id^ commit-id

# then rebase on your develop branch:
git rebase develop

(Maybe make a backup of you branch before, in case something goes wrong)

        1) add support for adding attribute values passed via a hash
        2) add testcases for it
        1) refactor #attribute_values_to_hash
        2) rewrite comment
        One of the Java method of Instances has unexpected behavior.
        enumerateAttributes will skip the class attribute once it is
        set. Therfore we have to manually add the class attribute,
        otherwise we we will miss it in every method that calls
        Instances#attributes.

See also:
http://weka.sourceforge.net/doc.stable/weka/core/Instances.html#enumerateAttributes--
        1) The previous fix only works when class attribute is the last
        attribute. The bug fix address that problem.

        2) use attribute in the middle as the class attribute for the
        test cases in the spec.
        1) use keyword argument for #attributes and #attribute_names
        2) fix #ensure_attribute_defined! and #attribute_with_name
        3) refactor private methods #attribute_values_to_hash and
        #attribute_values_from_hash
        4) add new test cases and refactor the spec
@kcning
Copy link
Contributor Author

kcning commented Dec 25, 2016

oke, it's done!

@paulgoetze
Copy link
Owner

Thanks @kcning for your patience with all my nitpicking on changes and codestyle! ;) Looks great.

@paulgoetze paulgoetze merged commit a1ed62a into paulgoetze:develop Dec 25, 2016
@kcning kcning deleted the feature/add-instance branch December 25, 2016 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants