top | item 39046844

HAR file

33 points| yevyevyev | 2 years ago |docs.multiple.dev

Hey HN,

We just shipped a new AI-powered feature... BUT the "AI" piece is largely in the background. Instead of relying on a chatbot, we've integrated AI (with strict input & output guardrails) into a workflow to handle two specific tasks that would be difficult for traditional programming:

1. Identifying the most relevant base URL from HAR files, since it would be tedious to cover every edge case or scenario to omit analytics, tracking, and other network noise.

2. Generating synthetic data for API requests by passing the API context and faker-js functions to GPT-4.

The steps are broken down into a simple flow, with users working with the AI and verifying the output throughout.

All of the focus is on reducing cognitive load and speeding up test generation.

Let me know what you think!

16 comments

[+] pitah1|2 years ago|reply

Interesting feature. One key thing I found when testing is that for you to reproduce the set of steps the user went through, there are a data attribute(s) that need to remain the same. For example, after login, a request for your account information will contain an account_id number that should be the same for all other account requests. If you can't guarantee this, then I don't see how you could use this in any sort of integration tests.

Isn't it simpler to use the Open API spec then generate from there?

[+] jon_a_rey|2 years ago|reply

Hi @pitah1, thanks for the comment! I'm Jon, the guy who built the feature :)

You're spot on here. Unique identifiers within a flow shouldn't necessarily be replaced with a randomly generated string. We've attacked this problem from two different angles.

1. If an API request has a JSON response, we'll generate JS load test code for that request that begins with `const responseA = await axios[...];`. You can edit the load test to use the response data in subsequent requests.

2. We also attempt to intelligently replace UUIDs (or predictable identifiers) with a placeholder like `{fieldName}`. This highlights the values that need user intervention.

We use the Swagger schema to determine the available API endpoints, if each endpoint has a request body vs query string params vs none, and so on. We sprinkle in AI to help decide how to best saturate request bodies with realistic data via faker.js.

[+] dmitry_dygalo|2 years ago|reply

This reminds me of several solutions albeit lacking the explicit "AI" part:

- Up9 observes traffic and then generates test cases (as Python code) & mocks

- Dredd is built with JavaScript, runs explicit examples from the Open API spec as tests + generates some parts with faker-js

- EvoMaster generates test cases as Java code based on the spec. However, it is a greybox fuzzer, so it uses code coverage and dynamic feedback to reach deeper into the source code

There are many more examples such as Microsoft's REST-ler, and so on.

Additionally, many tools exist that can analyze real traffic and use this data in testing (e.g. Levo.ai, API Clarity, optic). Some even use eBPF for this purpose.

Given all these tools, I am skeptical. Generating data for API requests does not seem to me to be that difficult. Many of them, already combine traffic analysis & test case generation into a single workflow.

For me, the key factors are the effectiveness of the tests in achieving their intended goals and the effort required for setup and ongoing maintenance.

Many of the mentioned tools can be used as a single CLI command (not true for REST-ler though), and it is not immediately clear how much easier it would be to use your solution than e.g. a command like `st run <schema url/file>`. Surely, there will be a difference in effectiveness if both tools are fine-tuned, but I am interested in the baseline - what do I get if I use the defaults?

My primary area of interest is fuzzing, however, at first glance, I'm also skeptical about the efficacy of test generation without feedback. This method has been used in fuzzing since the early 2000s, and the distinction between greybox and blackbox fuzzers is immense, as shown by many research papers in this domain. Specifically in the time a fuzzer needs to discover a problem.

Sure, your solution aims at load testing, however, I believe it can benefit a lot from common techniques used by fuzzers / property-based testing tools. What is your view on that?

What strategies do you employ to minimize early rejections? That is, ensuring that the generated test cases are not just dropped by the app's validation layer.

[+] yevyevyev|2 years ago|reply

Hi Dmitry, thanks for replying here. You raise some good points - I'll do my best to address them below. I'll also add that we have a fully-featured free tier that you can sign up for on our website (www.multiple.dev). Any hands-on feedback from our product or the TestGen feature would be extremely helpful.

Test feedback - during our TestGen flow, the user provides feedback on the sequence and contents of the API requests. And at the end of the flow, our users can manually edit the resulting JS code for additional customization.

Effort to create a load test - You can go from a Swagger or HAR file to a function load test, written in JS, in a few minutes. There is no learning curve, assuming you have basic knowledge of JavaScript. Maintenance is typically minimal.

CLI - we are launching our CLI shortly, where users can start tests from command line as you describe. It'll work similarly to Jest or other unit test frameworks, where the test scripts will live in our user's codebase.

The use of AI - we use AI to generate realistic-looking synthetic data, which can be challenging with strings. The AI matches each field to the most relevant faker-js function. We need the content of the string to look like something the target application would receive in production. And with HAR files, we use AI to help filter out irrelevant requests such as analytics.

I hope that was helpful, and I'm happy to go into more detail.

[+] ushakov|2 years ago|reply

Why is AI needed for this at all?

You should take a look at Schemathesis (https://github.com/schemathesis/schemathesis)

[+] yevyevyev|2 years ago|reply

From my understanding, Schemathesis can generate data based on a value being a string, number, boolean, etc. It also seems fairly manual to set up and has a learning curve. Our output is JavaScript that can be run anywhere.

With our TestGen feature, the AI looks at example requests in a HAR file or Swagger examples, or it can solely rely on the name of the property. From there, it automatically generates the correct type and format of data - e.g., if a field is named "address," it generates a value that looks like an address and is formatted in the same way as examples. It wouldn't be practical to cover every potential edge case and scenario without AI.

[+] cebert|2 years ago|reply

How is this different from Postman’s test generation features? https://www.postman.com/postman-galaxy/dynamically-generate-...

[+] yevyevyev|2 years ago|reply

Postman generates data based on datatypes in the OpenAPI spec: strings, numbers, booleans, etc - but the data will not look realistic. The video outlined a rudimentary test that checked if required fields were present.

Our TestGen feature generates realistic-looking data, such as dates, names, addresses, URLs, etc, automatically based on the field names, examples, and other API spec metadata. It does this automatically, without human intervention. The output is JavaScript, so if further customization is needed, such as using a response value of an API call in a subsequent request, you can do that.

[+] xwowsersx|2 years ago|reply

This can be used to generate data given an openAPI spec? Bit unclear on whether that's unbundled from test generation. Say I just want to generate data that conforms to a spec as one-off. Can this be done?

[+] yevyevyev|2 years ago|reply

That’s a great question. The TestGen feature generates JavaScript that uses faker-js functions to generate the test data, and axios to make http requests. You can copy and paste the JS output of TestGen and run it anywhere.

[+] edrenova|2 years ago|reply

just to jump in here - you can do this with Neosync, an OSS synthetic data generator and orchestrator for any schema and includes relational integrity, anonymization, + more

(I'm one of the co-founders)

If you're interested (github.com/nucleuscloud/neosync)