Ability to get path of queried nodes #76

Closed
FruitieX opened this issue Jan 17, 2024 · 11 comments · Fixed by #78

@FruitieX

Hi,

I have a use case where I need both the path (in some format) and the value of queried nodes. This doesn't seem possible at the moment. How much work do you think this would require? I might take a stab at this eventually. 😄

@FruitieX
Author

#77

@hiltontj
Owner

Hey @FruitieX - thanks for opening the issue. I see the PR and may take a look at it, but I think it would be good to discuss here a bit first.

This concept is documented in the IETF standard, see Normalized Paths. There is a C# IETF JsonPath implementation that has them built into the query evaluation mechanism, and which you can see in their demo environment.

Unfortunately, serde_json_path, as you have discovered, does not cover this portion of the standard. I have some ideas though, which is why I want to discuss first, and which I will try to get down here when I get a minute or two. Ultimately, my time is a bit tight to implement this these days, so if it is something that you could take the lead on then that would be awesome!

@hiltontj
Owner

I would start by defining a type to represent a normalized path, e.g.,

struct NormalizedPath<'a>(Vec<PathElement<'a>>);

enum PathElement<'a> {
    String(&'a str), // for object keys
    Index(usize), // for list indices
}

This representation should allow you to construct the path as you recurse down the JSON structure using data borrowed or copied from the JSON value being queried, rather than cloning everything into owned strings. That is, 'a here is the lifetime of the JSON value being queried, and string keys borrowed from that JSON value have the same lifetime. Array indices are usize, which is Copy, so there is no need to worry about lifetimes for those.
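As an illustration of the borrowing idea above, here is a minimal sketch that collects the path of every leaf while recursing. It is not the crate's implementation: the `Json` enum is a hypothetical, dependency-free stand-in for `serde_json::Value`, and `collect_leaves` and the `Display` impl are names invented for this example. Escaping of quotes inside object keys is left out.

```rust
use std::collections::BTreeMap;
use std::fmt;

// Hypothetical stand-in for `serde_json::Value`, so the sketch needs no crates.
#[allow(dead_code)]
#[derive(Debug)]
enum Json {
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

#[derive(Clone, Debug, PartialEq)]
enum PathElement<'a> {
    String(&'a str), // object key, borrowed from the queried value
    Index(usize),    // array index, copied
}

#[derive(Clone, Debug, PartialEq)]
struct NormalizedPath<'a>(Vec<PathElement<'a>>);

impl fmt::Display for NormalizedPath<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "$")?;
        for element in &self.0 {
            match element {
                // Simplification: quotes/control characters in keys are not escaped.
                PathElement::String(name) => write!(f, "['{name}']")?,
                PathElement::Index(i) => write!(f, "[{i}]")?,
            }
        }
        Ok(())
    }
}

// Depth-first walk: push an element before descending, pop it afterwards.
// Only leaves clone the current stack; keys are never copied into new strings.
fn collect_leaves<'a>(
    value: &'a Json,
    current: &mut Vec<PathElement<'a>>,
    out: &mut Vec<(NormalizedPath<'a>, &'a Json)>,
) {
    match value {
        Json::Object(map) => {
            for (key, child) in map {
                current.push(PathElement::String(key.as_str()));
                collect_leaves(child, current, out);
                current.pop();
            }
        }
        Json::Array(items) => {
            for (i, child) in items.iter().enumerate() {
                current.push(PathElement::Index(i));
                collect_leaves(child, current, out);
                current.pop();
            }
        }
        leaf => out.push((NormalizedPath(current.clone()), leaf)),
    }
}

fn main() {
    let doc = Json::Object(BTreeMap::from([(
        "devices".to_string(),
        Json::Array(vec![Json::Bool(true), Json::Number(0.5)]),
    )]));
    let mut out = Vec::new();
    collect_leaves(&doc, &mut Vec::new(), &mut out);
    for (path, _node) in &out {
        println!("{path}"); // prints $['devices'][0] then $['devices'][1]
    }
}
```

The paths hold `&'a str` slices into the document, so they cannot outlive the value that was queried.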

@FruitieX
Author

Thanks for the quick response, and yeah, that sounds more reasonable than my current implementation. I will try to adapt it into using borrowed data instead.

@hiltontj
Owner

@FruitieX - I may have some time to implement this. Based on your description and your PR, I gather you are looking for both the path and the node to be returned, e.g.,

impl JsonPath {
    fn query_path<'a>(&self, value: &'a Value) -> Vec<(NormalizedPath<'a>, &'a Value)> {
        /* ... */
    }
}

Is it important that the node, i.e., the &'a Value, is returned along with the path, or would you need just the path? I am trying to decide whether it is better to do the above, or to return only the paths, e.g.,

impl JsonPath {
    fn query_path<'a>(&self, value: &'a Value) -> Vec<NormalizedPath<'a>> {
        /* ... */
    }
}

It would be helpful to understand your use-case a bit more and why you need both, if that is the case.

@FruitieX
Author

FruitieX commented Jan 26, 2024

My use case is a bit unusual, but basically I have a JsonPath with wildcards and I need to know which keys matched the wildcards for each matched value. While less convenient, returning just the paths would also work, as I could then use the path to build a JSON pointer and query the underlying value with serde_json::value::Value::pointer().
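The path-to-pointer conversion mentioned above could look roughly like the following sketch. It assumes the `PathElement` shape proposed earlier in this thread. `to_json_pointer` is a hypothetical helper, not part of any crate. Escaping follows the JSON Pointer rules of RFC 6901 (`~` becomes `~0`, `/` becomes `~1`).

```rust
// Path element shape as proposed earlier in the thread.
#[allow(dead_code)]
enum PathElement<'a> {
    String(&'a str), // object key
    Index(usize),    // array index
}

// Hypothetical helper: render path elements as an RFC 6901 JSON pointer.
fn to_json_pointer(path: &[PathElement<'_>]) -> String {
    let mut pointer = String::new();
    for element in path {
        pointer.push('/');
        match element {
            // `~` must be escaped before `/`, otherwise the `~` introduced
            // by `~1` would be escaped a second time.
            PathElement::String(name) => {
                pointer.push_str(&name.replace('~', "~0").replace('/', "~1"))
            }
            PathElement::Index(i) => pointer.push_str(&i.to_string()),
        }
    }
    pointer
}

fn main() {
    let path = [
        PathElement::String("devices"),
        PathElement::String("hue"),
        PathElement::String("office"),
        PathElement::String("state"),
    ];
    println!("{}", to_json_pointer(&path)); // prints /devices/hue/office/state
}
```

An empty path yields the empty string, which in JSON Pointer terms refers to the whole document.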

The long(er) version is:

I'm writing a program that controls my home automation devices. I'm allowing users (=myself 😃) to configure the behavior of their devices using a simple scripting language called evalexpr. It doesn't have JSON support, but instead I can make use of simple variable assignments like:

devices.hue.office.state.power = true;
devices.hue.office.state.color.r = 255;
devices.hue.office.state.color.g = 128;
devices.hue.office.state.color.b = 0;

devices.lifx.bedroom.state.power = true;
devices.lifx.bedroom.state.brightness = 0.5;

I then build a serde_json::Value from the resulting variable context, and start querying it with serde_json_path. For example, to get the state struct of each device I can use the following query:

$.devices.*.*.state

In addition to the state, I now need to know the vendor ID and device ID from the path, so that I know which devices to send the state updates to.

There are probably better ways of doing this, but this way was pretty convenient to implement since my state structs already implemented Serialize/Deserialize due to them being directly used in a REST API / web frontend.
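To make the requirement concrete, here is a small sketch of pulling the vendor and device IDs out of a matched path, assuming the path is available as a slice of elements shaped like the `PathElement` proposed earlier in this thread. `vendor_and_device` is a hypothetical name used only for this illustration.

```rust
// Path element shape as proposed earlier in the thread.
#[allow(dead_code)]
enum PathElement<'a> {
    String(&'a str), // object key
    Index(usize),    // array index
}

// Hypothetical helper: for a path of the form devices.<vendor>.<device>.state,
// return the two keys that matched the wildcards in `$.devices.*.*.state`.
fn vendor_and_device<'a>(path: &[PathElement<'a>]) -> Option<(&'a str, &'a str)> {
    match path {
        [PathElement::String(root), PathElement::String(vendor), PathElement::String(device), PathElement::String(leaf)]
            if *root == "devices" && *leaf == "state" =>
        {
            Some((*vendor, *device))
        }
        // Any other shape (wrong length, array indices, other keys) is rejected.
        _ => None,
    }
}

fn main() {
    let path = [
        PathElement::String("devices"),
        PathElement::String("lifx"),
        PathElement::String("bedroom"),
        PathElement::String("state"),
    ];
    if let Some((vendor, device)) = vendor_and_device(&path) {
        println!("{vendor}/{device}"); // prints lifx/bedroom
    }
}
```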

@hiltontj
Owner

Thank you for the write-up @FruitieX!

To be clear, my understanding is that you use the JSONPath to query for the state objects, but then you require the paths themselves to extract the vendor and device IDs, e.g., lifx/bedroom and hue/office, that are associated with the given state objects provided by the query.


Given that Normalized Paths are a part of the JSONPath spec, and this crate is meant to support that spec, I feel obliged to incorporate the feature. Since this could affect the API and underlying query execution in a substantial way, I want to determine if there is significant overhead to having the underlying query logic produce Vec<(NormalizedPath<'a>, &'a Value)> vs. what it is currently doing, i.e., Vec<&'a Value>. So, I can't necessarily promise quick delivery, but I have already started putting something together in #78.

@FruitieX
Author

FruitieX commented Jan 26, 2024 via email

@hiltontj
Owner

hiltontj commented Feb 1, 2024

Hey @FruitieX - I have made some solid headway in #78. With it, you would be able to do something like:

struct Device {
    vendor_id: String,
    device_id: String,
    state: Value,
}

let config = json!({ /* JSON of home devices configuration */});
let path = JsonPath::parse("$..state")?; // use `..` operator to get all nested `state` nodes
let devices: Vec<Device> = path
    .query_located(&config) // use new `query_located` method
    .iter() // iterate over `LocatedNode`s
    .map(|q| { // map them into the `Device` type (or whatever)
        let loc = q.location(); // get the location, i.e., full normalized path to `state` node
        // extract elements of interest from path:
        let vendor_id = loc.get(1).to_string();
        let device_id = loc.get(2).to_string();
        // the state object itself is the node that was queried for:
        let state = q.node().to_owned();
        Device { vendor_id, device_id, state }
    })
    .collect();

@FruitieX
Author

FruitieX commented Feb 2, 2024

Excellent, this should cover all my needs, thanks!

hiltontj added a commit that referenced this issue Feb 2, 2024
@hiltontj
Owner

hiltontj commented Feb 3, 2024

@FruitieX - just released v0.6.5. Thank you again for raising the issue and for the helpful discussion! 🍻
