item 38012032

Show HN: OpenAPI DevTools – Chrome extension that generates an API spec

811 points | mrmagoo2 | 2 years ago | github.com

Effortlessly discover API behaviour with a Chrome extension that automatically generates OpenAPI specifications in real time for any app or website.

102 comments

[+] the_absurdist|2 years ago|reply
I wish this would document the auth headers.

What would be particularly useful is if it saved token values and then (through search) matched them against the responses of earlier calls to find the auth call that issued the token.

That way you could easily determine which auth call was needed to get a token for the endpoint.

[+] mrmagoo2|2 years ago|reply
Great suggestion, I will look into this.
[+] ttul|2 years ago|reply
This is super cool. Writing code to drop into the JavaScript console lets you do insane things. I’ve found great success using ChatGPT to help me write the code, which I then just cut and paste into the console. Asking it to “make it all run in parallel using async/await” will massively speed up execution of serial tasks.

For instance, I had GPT help me write browser JS that groks literally thousands of IP addresses in an open security tool that shall not be named. I can vacuum up much of their entire database in seconds by making hundreds of async calls. While they do have bot protection on the website, they appear to have no protection at all on their browser APIs once the user has been given a cookie… I suspect this is common.
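As a rough illustration of the serial-vs-parallel speedup being described (a Python sketch using asyncio; `fetch` is a hypothetical stub that simulates network latency rather than hitting any real API):

```python
import asyncio
import time

async def fetch(ip):
    # Stand-in for a real HTTP request; the sleep simulates network latency.
    await asyncio.sleep(0.1)
    return f"record for {ip}"

async def serial(ips):
    # One request at a time: total time ~ 0.1s * len(ips).
    return [await fetch(ip) for ip in ips]

async def parallel(ips):
    # All requests in flight at once: total time ~ 0.1s regardless of count.
    return await asyncio.gather(*(fetch(ip) for ip in ips))

ips = [f"10.0.0.{i}" for i in range(50)]

start = time.perf_counter()
results = asyncio.run(parallel(ips))
elapsed = time.perf_counter() - start

print(len(results), "records in", round(elapsed, 1), "s")
```

The serial version would take about 5 seconds here; the gathered version finishes in roughly one round-trip.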

[+] jhardy54|2 years ago|reply
This is about OpenAPI (Swagger), not OpenAI (ChatGPT).
[+] archiewood|2 years ago|reply
My most common use case here is then wanting to hit the API from Python and adjust the params / URL etc.

Would love a "copy to python requests" button that:

- grabs the headers

- generates a boilerplate Python snippet including the headers and the URL:

    import requests
    import json

    url = '<endpoint>'

    headers = {
        'User-Agent': 'Mozilla/5.0 ...',
        # ... any other captured headers
    }

    data = {
        "page": 5,
        "size": 28,
        # ... any other body fields
    }

    response = requests.post(url, headers=headers, data=json.dumps(data))

    if response.status_code == 200:
        print(response.json())
    else:
        print(f"Error {response.status_code}: {response.text}")
[+] ea016|2 years ago|reply
Steps to do so:

- open the network console

- right click on the request

- click "copy as curl"

- visit https://curlconverter.com/ to convert to Python/Node/any language

[+] gabrielsroka|2 years ago|reply
1. You should almost always use requests.Session() instead of requests. It's faster, and can make the code shorter.

2. requests can dump to JSON for you by using json=, so you don't need a separate module. It'll even set the content-type header to application/json for you.

    import requests

    url = '<endpoint>'

    headers = {
        'User-Agent': 'Mozilla/5.0 ...',
        # ... any other captured headers
    }

    session = requests.Session()
    session.headers.update(headers)

    data = {
        "page": 5,
        "size": 28,
        # ... any other body fields
    }

    response = session.post(url, json=data)

    if response.status_code == 200:
        print(response.json())
    else:
        print(f"Error {response.status_code}: {response.text}")
[+] westurner|2 years ago|reply
SeleniumIDE can record and save browser test cases to Python: https://github.com/SeleniumHQ/selenium-ide

awesome-test-automation/python-test-automation.md lists a number of ways to wrap selenium/webdriver and also playwright: https://github.com/atinfo/awesome-test-automation/blob/maste...

vcr.py, playback, and rr do HTTP test recording and playback. httprunner can record and replay HAR. DevTools can save HTTP requests and responses to HAR files.

awesome-web-archiving lists a number of tools that work with WARC, but only har2warc for HAR: https://github.com/iipc/awesome-web-archiving/blob/main/READ...

[+] yread|2 years ago|reply
wow what a perfect service to steal session cookies
[+] jimmySixDOF|2 years ago|reply
Nice, this made me go back and check up on the Gorilla LLM project [1] to see what they're doing with APIs, and whether they've applied their fine-tuning to any of the newer foundation models. It looks like things have slowed down since they launched (?), or maybe development is happening elsewhere on some invisible Discord channel. I hope the intersection of API calling and LLMs as a logic-processing function keeps getting focus; it's an important direction for interop across the web.

[1] https://github.com/ShishirPatil/gorilla

[+] ricberw|2 years ago|reply
This is awesome!

I'll second/third the feature request for auto-including auth headers/calls (as many of the sites I'm trying to understand/use APIs from use persistent keys, and scraping these separately is just unnecessary extra time).

On that same note, I'd greatly appreciate keeping the initial request as a "sample request" within the spec.

I'd also greatly appreciate an option to attempt to automatically scrape for required fields (e.g. try removing each query variable one at a time, look for errors, document them).

Thanks for this :)
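The probing idea described above (drop each query variable in turn and watch for errors) can be sketched in Python; `call_endpoint` here is a hypothetical stub standing in for a real HTTP call:

```python
# Hypothetical fake server: "q" is required, "page" and "size" are optional.
def call_endpoint(params):
    if "q" not in params:
        return 400
    return 200

def find_required(params):
    # Remove one parameter at a time; if the call now errors, it was required.
    required = []
    for name in params:
        trimmed = {k: v for k, v in params.items() if k != name}
        if call_endpoint(trimmed) >= 400:
            required.append(name)
    return required

probed = find_required({"q": "openapi", "page": 1, "size": 20})
print(probed)  # ['q']
```

Against a live API you would rate-limit these probes and treat any non-2xx status (or a changed response shape) as evidence the field matters.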

[+] autonomousErwin|2 years ago|reply
This is a first step towards turning the entire web into an API, albeit before we hit the login/signup roadblocks (but then that's where agents come in).
[+] toyg|2 years ago|reply
That used to be called "the semantic web".

Dreams never die and what is old will be new again.

[+] digitalsanctum|2 years ago|reply
Great project! These features come to mind that would be great additions:

1. Ability to filter response properties.

2. Ability to work with non-JSON (web scraping) by defining a mapping of CSS selectors to response properties.

3. Cross-reference host names of captured requests with publicly documented APIs.

4. If auth headers are found, prompt user for credentials that can then be stored locally.

5. "Repeater" similarly found in Burp Suite.

6. Generate clients on the fly based on the generated OpenAPI spec.

[+] worldsayshi|2 years ago|reply
- Allow using it as a library instead of just a browser extension, which would in turn allow:

- Integration with some kind of web crawler, to automatically walk a web site and extract a database of specifications

Edit: Hmm, it seems that genson-js[1] was used to merge schemas.

1 - https://www.npmjs.com/package/genson-js
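For anyone curious what that schema merging amounts to, here is a minimal pure-Python sketch of the genson-style idea (the field names and samples are made up): infer a JSON type per field across samples, and mark a field required only if every sample contains it.

```python
def merge_samples(samples):
    # Map Python types of decoded JSON values to JSON Schema type names.
    type_names = {bool: "boolean", int: "integer", float: "number",
                  str: "string", list: "array", dict: "object",
                  type(None): "null"}
    props, seen_in_all = {}, None
    for sample in samples:
        keys = set(sample)
        # Intersect key sets: a field missing from any sample is optional.
        seen_in_all = keys if seen_in_all is None else seen_in_all & keys
        for key, value in sample.items():
            props.setdefault(key, set()).add(type_names[type(value)])
    return {
        "type": "object",
        "properties": {k: {"type": sorted(t)} for k, t in props.items()},
        "required": sorted(seen_in_all or set()),
    }

schema = merge_samples([
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b", "tags": ["x"]},
])
print(schema["required"])  # ['id', 'name'] -- "tags" was absent from one sample
```

A real merger also recurses into nested objects and array items, which is most of what genson-js adds on top of this.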

[+] digitalsanctum|2 years ago|reply
7. Train a machine learning model to recognize and extract tabular and repeated data based on training data.

8. Optionally publish generated OpenAPI specs to a central site or open PR to a GH repo, "awesome-openapi-devtools"?

[+] mrmagoo2|2 years ago|reply
Some great ideas here, thank you. I do want to keep it small and focused so I'll forego complex functionality like the Repeater, but you've raised some common pain points I'll tackle.
[+] ch_sm|2 years ago|reply
Very nice! Auto-generating type information from looking at permutations of values is hard though. Q: Does this handle optional values? Also, being able to mark a string field as an "enum" and then collecting the possible values, instead of just typing it as "string", would be mega handy.
[+] mrmagoo2|2 years ago|reply
It doesn't have any way of determining which values are optional, so it doesn't make that distinction. Hear you on the enums, I'll take another look at what's possible without adding overhead.
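A cheap way to surface enum candidates without much overhead might look like this sketch (the thresholds are arbitrary and the field names are made up): track the distinct values per string field, and propose an enum only when the value set stays small across many samples.

```python
from collections import defaultdict

def propose_enums(samples, threshold=5, min_samples=10):
    # Collect every distinct value observed for each string-valued field.
    values = defaultdict(set)
    for sample in samples:
        for key, value in sample.items():
            if isinstance(value, str):
                values[key].add(value)
    # Few distinct values over many samples suggests a closed vocabulary.
    return {k: sorted(v) for k, v in values.items()
            if len(v) <= threshold and len(samples) >= min_samples}

samples = [{"status": s, "id": str(i)}
           for i, s in enumerate(["open", "closed", "open", "merged"] * 5)]
enums = propose_enums(samples)
print(enums)  # {'status': ['closed', 'merged', 'open']} -- 'id' never converges
```

This is only a heuristic: a field can look enum-like in a small capture and turn out to be free text, so a real tool would keep counting and demote the field once it passes the threshold.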
[+] RileyJames|2 years ago|reply
Amazing. I’ve often wished this would exist. Thank you.

It was always my step 1 towards Xxx. Keen to know what directions you were thinking of?

I'd love to see more remixing on top of APIs that websites typically only expose for their own use.

[+] mrmagoo2|2 years ago|reply
For sure, there are a few tools out there like Requestly to change API behaviour, but it's a frustrating experience. In terms of direction, I'm planning to keep this simple, so I've no plans for additional features.
[+] saran945|2 years ago|reply
Thanks for sharing the Chrome extension @mrmagoo2.

It's amazing to see a tool that simplifies the process of generating OpenAPI specs. This is the best Show HN this year.

[+] ushakov|2 years ago|reply
Agreed! What would be more awesome though is if it could generate OpenAPI spec from existing HAR files
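Since a HAR file is just JSON with a log.entries list, a first cut at HAR-to-OpenAPI can be sketched in a few lines of Python (the spec it emits is deliberately minimal, and the sample HAR below is made up):

```python
from urllib.parse import urlsplit

# A HAR file is JSON with log.entries, each holding request.method,
# request.url and response.status. Fold those into OpenAPI paths/operations.
def har_to_openapi(har):
    paths = {}
    for entry in har["log"]["entries"]:
        req = entry["request"]
        path = urlsplit(req["url"]).path          # drop host and query string
        method = req["method"].lower()
        status = str(entry["response"]["status"])
        op = paths.setdefault(path, {}).setdefault(method, {"responses": {}})
        op["responses"].setdefault(status, {"description": ""})
    return {
        "openapi": "3.0.0",
        "info": {"title": "Recorded API", "version": "0.0.1"},
        "paths": paths,
    }

har = {"log": {"entries": [
    {"request": {"method": "GET", "url": "https://api.example.com/users?page=1"},
     "response": {"status": 200}},
    {"request": {"method": "POST", "url": "https://api.example.com/users"},
     "response": {"status": 201}},
]}}
spec = har_to_openapi(har)
print(sorted(spec["paths"]["/users"]))  # ['get', 'post']
```

A usable version would also infer path parameters (e.g. collapsing /users/1 and /users/2 into /users/{id}) and merge body/response schemas, which is the hard part the extension handles.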
[+] jtbayly|2 years ago|reply
This looks very useful, but what do I do with the discovered data?

Suppose I have a site that runs a search that I want to be able to automate. However, instead of sending the search term in the URL, it updates live (presumably via some API call).

Now suppose I need a one-click solution to be able to open that page and run a specific search.

Is there another Chrome plugin that would allow me to use this API data to make that happen?

[+] jpmonette|2 years ago|reply
I've had in mind to build something like this for quite some time, to quickly explore undocumented APIs. Looking forward to seeing your progress!
[+] HanClinto|2 years ago|reply
Okay, this is wonderful. Love it already!!

Sometimes I click on a path parameter and it doesn't "create" it, even though there are several other examples in the list. Not sure if it's a bug, or something I'm doing wrong.

Overall, this is an absolutely wonderful tool and I've wanted something like this for a long time. Incredibly useful, thank you!!

[+] mrmagoo2|2 years ago|reply
That sounds like a bug; I need to test that feature more thoroughly. Thanks for reporting.
[+] pbnjay|2 years ago|reply
Damn I literally built a really similar tool myself using HAR files just a couple weeks ago! Yours is way more polished though, nice work.

I have a lot of ideas in this space (some PoCs), and I've been starting to scope out a company around them. Would love to chat to see if there's any shared opportunity for both of us!

[+] ushakov|2 years ago|reply
The problem with this type of tool is that it only produces specs based on the info it can observe.

The spec produced will be incomplete (missing paths, methods, response variants, statuses). For a complete spec you should use a framework like Fastify, NestJS, tsoa, or FastAPI, which have built-in OpenAPI support.

Can be very valuable for reverse-engineering though :)

[+] hubraumhugo|2 years ago|reply
Really cool. We're using a similar technique at Kadoa to auto-generate scrapers for any website. Analyzing network calls to find the desired data in API responses is one of the first things we do before starting to process the DOM.