I need to work with some systems that use JMESPath to search JSON, but I found that it is missing a lot of important features. For example, it is very hard to search strings by pattern (like this). It does not support regular expressions or case-insensitive search. The proposal to add a split function has been frozen since 2017 (like this and this). These features are all available in jq. So why do systems like AWS S3 CLI and Ansible use JMESPath instead of jq to query JSON?
For those who vote to close this question: this is a fact-based question because I am seeking evidence, not just opinion. There are many similar questions on Stack Overflow (e.g. https://stackoverflow.com/questions/383692/what-is-json-and-why-would-i-use-it?rq=1) that are not closed. – HKTonyLee Jun 18 '21 at 18:35
-
IMO, your edit makes it even more opinion-based. The only people who can give you a fact-based answer as to why this was favoured over that in a tool, language or framework are the architects or developers of that tool, language or framework. And even then, the answer will be so specific that it won't be valuable. – β.εηοιτ.βε Jun 20 '21 at 11:20
-
Real example: in Magento 1.x, prototype.js was used instead of jQuery. We all know the popularity of jQuery, so it would be a no-brainer now. *But*, back then, they were equally popular, so the developers just bet on the wrong horse. Although the idea of changing it was there, it was too complex and too much of a backward-compatibility issue, so they only ended up rectifying this in version 2.x. – β.εηοιτ.βε Jun 20 '21 at 11:24
-
Thanks @β.εηοιτ.βε for your comment! But I think the answer below, together with the comment by @peak, gives a pretty good reason: `jq` can easily exhaust server resources, while JMESPath cannot. A typical server would not accept running a `jq` program because of that. As far as I can tell, this is a double-edged sword. – HKTonyLee Jun 22 '21 at 00:34
-
Even in the prototype.js vs jQuery case, there must have been some reason for the Magento architects to choose it. Popularity is one reason, but they must have had stronger reasons that outweighed popularity. They didn't decide to go for prototype.js because "I like it more". Those reasons are what I am looking for. – HKTonyLee Jun 22 '21 at 04:08
-
I believe this stems from two different mindsets. Mine is "OK, they are opinionated, but are they really opinionated?" The people who vote to close my question think "OK, they are opinionated, and people who ask such questions are also opinionated, because there cannot be any other reasons." – HKTonyLee Jun 22 '21 at 04:10
-
That's a technically interesting point of view, indeed. But that's this user's opinion. It is not the reason I use JMESPath, and my reason is probably different from the reason Ansible or the AWS CLI does… So all you are going to gather are opinionated answers. – β.εηοιτ.βε Jun 22 '21 at 09:13
-
Look, in [this question](https://stackoverflow.com/questions/55884514/what-is-the-incentive-for-curl-to-release-the-library-for-free) we even had the chance to get the point of view of the developer of the tool. It is **the** answer, as you cannot argue that the person who wrote cURL does not know the reasoning behind a choice made there. Still, it has been closed as opinion-based. – β.εηοιτ.βε Jun 22 '21 at 09:16
-
The [question](https://stackoverflow.com/questions/55884514/what-is-the-incentive-for-curl-to-release-the-library-for-free) you mentioned has been reopened because it should not be considered opinion-based. Please check the related discussion [here](https://meta.stackoverflow.com/questions/384376/are-questions-about-the-motives-of-programming-library-developers-on-topic/384400#384400). I don't know the answer, so I am asking, and I hope someone who knows it will answer :-/ – HKTonyLee Jun 23 '21 at 18:24
-
If people think my question is badly written, please kindly reword it or downvote it. Closing the question is the worst outcome because it prevents anyone from contributing to it. – HKTonyLee Jun 23 '21 at 18:27
1 Answer
7
It's not so much about the difference between JMESPath and jq as the different ways they are used.
Suppose you are querying a remote resource and the result is going to number in the millions of records, but you only care about a specific, much smaller subset of them. You have two choices:
- Have every record transmitted to you over the network, then pick out the ones you want locally.
- Send your filter to the remote resource and have it do the filtering, sending you only the matching records.
jq is typically used for the former, JMESPath for the latter. There's no reason why the remote service couldn't accept a jq filter, or that you couldn't use a JMESPath-based executable.
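The trade-off between the two choices can be sketched with a toy model. This is a minimal illustration, not how any real service works: `FakeRemoteService`, `fetch_all`, `fetch_matching`, and the record layout are all hypothetical names invented for the example, and the "network" is just a counter of records handed back to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    id: int
    status: str


class FakeRemoteService:
    """Hypothetical stand-in for a remote API; counts records 'sent' to the client."""

    def __init__(self, records):
        self._records = list(records)
        self.records_sent = 0

    def fetch_all(self):
        # Choice 1: the service ships every record; the client filters later.
        self.records_sent += len(self._records)
        return list(self._records)

    def fetch_matching(self, status):
        # Choice 2: the service applies the filter and ships only the matches.
        matches = [r for r in self._records if r.status == status]
        self.records_sent += len(matches)
        return matches


def make_records(n):
    # One record in every thousand is marked "failed"; the rest are "ok".
    return [Record(i, "failed" if i % 1000 == 0 else "ok") for i in range(n)]


# Choice 1: transfer everything, filter locally (the typical jq workflow).
service = FakeRemoteService(make_records(100_000))
local_matches = [r for r in service.fetch_all() if r.status == "failed"]
sent_client_side = service.records_sent

# Choice 2: send the filter, receive only matches (the typical JMESPath workflow).
service = FakeRemoteService(make_records(100_000))
remote_matches = service.fetch_matching("failed")
sent_server_side = service.records_sent

print(sent_client_side, sent_server_side)
```

Both approaches yield the same records; the difference is how many cross the wire and who pays for evaluating the filter.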
chepner
-
Thanks for your answer! This is getting interesting. I guess that because jq is a Turing-complete language, it is less suitable for running arbitrary jq programs on the server side. I am not familiar with JMESPath, but let me check whether it is Turing-complete. – HKTonyLee Jun 18 '21 at 20:35
-
I should stress that my answer is based solely on how I see each *used*, and not so much on what either would be *suitable* for. Perhaps JMESPath is optimized to be simpler to write, but less powerful. (Or maybe JMESPath was more powerful when introduced, and so made people choose it as their query language, but `jq` caught up in the meantime.) – chepner Jun 18 '21 at 20:37
-
Or JMESPath puts an emphasis on *filtering*, while `jq` emphasizes *transformation*. – chepner Jun 18 '21 at 20:38
-
I looked at some discussions on the Internet (e.g. https://github.com/serverlessworkflow/specification/issues/216, https://forum.snapcraft.io/t/jmespath-in-the-snap-tooling-we-need-your-help/4108/2). They all mention that JMESPath lacks features they needed, but no one said that jq lacks features they needed. The only complaint about jq is that it lacks a well-defined spec. I would assume JMESPath is less powerful than jq. – HKTonyLee Jun 18 '21 at 20:43
-
I agree that your "JMESPath puts an emphasis on filtering, while jq emphasizes transformation" is a good summary. – HKTonyLee Jun 18 '21 at 20:44
-
I suspect the reason has to do with the fact that it is trivially easy to write jq programs that will consume as much CPU and/or RAM as is available. Consider e.g. `range(0;infinite)` or `[range(0;infinite)]` – peak Jun 18 '21 at 22:30
-
Thanks @peak! That is a very good example. Attackers can easily DDoS servers that accept `jq` filters. – HKTonyLee Jun 22 '21 at 04:22
-
I am afraid this is not enough. It also needs ulimit to restrict the memory usage. That means `jq` must be run in a separate process. – HKTonyLee Jun 22 '21 at 18:18
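As a rough sketch of what the comment above describes, an untrusted command can be run in a separate process with both a wall-clock timeout and a memory cap. This assumes a POSIX system such as Linux, where `resource.RLIMIT_AS` is enforced. The helper name `run_limited` and its default limits are made up for illustration, and the child commands here are small Python one-liners standing in for a `jq` invocation.

```python
import resource
import subprocess
import sys


def run_limited(cmd, timeout_s=10.0, max_mem_bytes=1 << 30):
    """Run cmd in a child process with a timeout and an address-space cap.

    Returns a (status, stdout) tuple where status is "ok", "failed" or "timeout".
    """

    def apply_limits():
        # Executed in the child just before exec (POSIX only): the equivalent
        # of `ulimit -v`. Never try to raise an already lower hard limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        limit = max_mem_bytes if hard == resource.RLIM_INFINITY else min(max_mem_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (limit, limit))

    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed when this expires
            preexec_fn=apply_limits,
        )
    except subprocess.TimeoutExpired:
        return ("timeout", "")
    return ("ok" if proc.returncode == 0 else "failed", proc.stdout)


# Stand-ins for an untrusted filter: one never terminates, one finishes quickly.
looping = [sys.executable, "-c", "while True: pass"]
quick = [sys.executable, "-c", "print(sum(range(10)))"]

timeout_result = run_limited(looping, timeout_s=1.0)
quick_result = run_limited(quick)
print(timeout_result, quick_result)
```

Even with such limits, every request costs a process launch and can burn CPU until the timeout fires, which is part of why a service might prefer a query language that cannot loop forever.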
-
Sending a `jq` filter to the server is a no-go because jq is Turing-complete. Downvoted because you confidently say "There's no reason why ..." and won't fix it even though the first comment pointed out this issue. – user2297550 Jan 03 '22 at 16:15
-
I'm referring to *technical* reasons. If the server wants to accept the risk of processing a non-terminating `jq` filter, it's free to do so. – chepner Jan 03 '22 at 16:20