User:BPirkle (WMF)/Stuff/ASC2021

From Meta, a Wikimedia project coordination wiki

API Specifications Conference 2021 Notes/Thoughts

Together with Alex Paskulin, I attended the API Specifications Conference 2021. The following are some notes and takeaways. Some of these are my own, some are quotes from various speakers, some are my (perhaps incorrect) paraphrase of what speakers intended to communicate, and some are shamelessly stolen from Alex.

Recordings are available on YouTube

Tuesday Keynote (Mandy Whaley and Yina Arenas, Microsoft)

The team at Microsoft intentionally avoided using project-specific naming in their APIs, because particular projects come and go and they wanted to avoid naming churn. Might be good advice for us in some cases. For example, suppose we'd used the name "okapi" in an exposed API. I wonder if, in our case, project-specific names would be appropriate at the namespace level but not lower. Something to think about.

Microsoft (at least at the Azure scale) considers breaking changes to APIs non-viable and requires in-place deprecation instead.

While APIs are technically about machines talking to machines, they're built by and for humans.

  • Developer-focused experience
  • People both building and using the API
  • Embrace customer-centric mindset
  • Customer-driven development

Speaker (summarized): "Hypothesis Progression Framework": make a hypothesis about what people want or how they interact with APIs, then validate that with actual people, then act on those findings, then circle back to the people to make sure what you built actually works for them.

Bill: I'm really curious what that approach would reveal about a unified Wikimedia API. It intuitively sounds like something we should have, but I wonder if actual humans want/need such a thing or if their usage mirrors our current per-project wiki-centric APIs. I'm honestly not sure whether volunteers see themselves as part of the "Wikimedia community" as a whole, or whether they see themselves as part of the (for example) Arabic Wikipedia community.

Alex: yeah per-project APIs might be more realistic, but it’s a good hypothesis to investigate regardless

The panelists were asked how to prevent duplicated APIs. The answer was that the governance team actively works to connect teams working on related projects. This, of course, requires a team that continuously does this work and stays aware of what's going on across the org, rather than a fire-and-forget team that creates some content and pieces of technology and then expects them to be sufficient on their own. Speakers in later panels mentioned that in their orgs, a small number of people tend to be permanently on the governance team, plus a number of people from across the org who serve on a rotating basis.

Book that was mentioned: The Customer-Driven Playbook

Contextualized API Specifications (Boris Vernoff, ADP)

ADP grew via acquisitions, leading to a heterogeneous collection of incompatible technologies. They used APIs to stitch all that together. But that process resulted in many APIs of considerable complexity. Many API users are interested in only a subset.

Some data may be prohibited by law in some countries but desirable in others. This causes some API users to be unable to use certain APIs (or certain API endpoints, or even certain features of certain endpoints). So different API users may have different views of what is available from the APIs, depending on where they're located. This probably has a larger effect in their payroll/benefits space than in ours, but I wonder if we'll ever have similar issues in some countries. Or maybe we already have these issues and I'm not aware of it.

From the speaker's slide, to illustrate the kinds of things they had to think through:

SOLUTION - REQUIREMENTS

  • Context-aware Developer's portal
    • Partner - choice of contexts
    • Consumer - subscription-based context
  • Context-specific API documentation
    • Contextualized schemas w/ tailored examples
    • Selective operation support
    • Customized business rules
  • API Registry
    • Data-driven API specification generation
    • User access /entitlements management
    • Infrastructure integrated

insert link to API Portal Access Diagram once it becomes available

ADP has a wizard-ish interface where API consumers can select what sort of things they're interested in and their API portal will filter the canonical OpenAPI specification and create a contextualized one, including example API calls (which are taken from separate .json files).

This contextual schema spans all the various available products. So in our case it might include a mix of APIs provided by core, services, etc. In this way, an API consumer could see the available data as one big API, when in fact it is actually provided by different pieces of the infrastructure.

The wizard builds a suggested contextual schema, but the user can add/remove stuff manually to customize as desired before custom schema generation. It all sounds pretty sophisticated - building the thing that creates these contextualized schemas must have been a significant effort.
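
To make the idea of a "contextualized" spec concrete, here is a minimal sketch of filtering one canonical OpenAPI document down to the operations a given consumer cares about. It is not ADP's implementation: selecting by operation `tags` is an assumption, and the function name `contextualize`, the tag names, and the paths are all hypothetical.

```python
import copy

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def contextualize(spec, wanted_tags):
    """Return a copy of `spec` that keeps only operations carrying a wanted tag."""
    wanted = set(wanted_tags)
    result = copy.deepcopy(spec)
    result["paths"] = {}
    for path, item in spec.get("paths", {}).items():
        # Keep an operation only if it shares at least one tag with the context.
        kept = {
            key: copy.deepcopy(value)
            for key, value in item.items()
            if key in HTTP_METHODS and wanted & set(value.get("tags", []))
        }
        if kept:
            # Carry over path-level fields (e.g. shared parameters) as well.
            for key, value in item.items():
                if key not in HTTP_METHODS:
                    kept[key] = copy.deepcopy(value)
            result["paths"][path] = kept
    return result


# Hypothetical canonical spec spanning several backends.
canonical = {
    "openapi": "3.0.3",
    "info": {"title": "Example canonical API", "version": "1.0.0"},
    "paths": {
        "/pages/{title}": {
            "get": {"tags": ["content"], "responses": {"200": {"description": "OK"}}},
            "put": {"tags": ["editing"], "responses": {"200": {"description": "OK"}}},
        },
        "/payroll": {
            "get": {"tags": ["payroll"], "responses": {"200": {"description": "OK"}}},
        },
    },
}

reader_view = contextualize(canonical, ["content"])
print(sorted(reader_view["paths"]))  # ['/pages/{title}']
```

A real portal would also need to prune unused schemas under `components` and attach the tailored examples; this sketch only covers operation selection.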

We brought OpenAPI Docs into our service catalog. Now what? (Shai Sachs & Zoe Song, Wayfair)

They tried several approaches for generating API specs:

  • Schema-first
    • required a lot of fine-grained knowledge of the OpenAPI spec
    • available editing tools helped, but were not easy to use either
  • Google docs
    • easy collaboration
    • low barrier to entry
    • no real way to validate, easy to create invalid specs
  • Code-first
    • support for this is built into many frameworks
    • API specs are always in-sync with code
    • rapid prototyping is possible via scaffolded code generation

While they still use multiple approaches, they found code-first to generally be best. However, they do have to meet teams "where they are", so Google Docs still get used in some cases.

Finding Ways to Measure the Complexity of an API Design (Stephen Mizell, API Consultant)

The info covered in this talk is also available here

Some notable takeaways and discussion:

Speaker: Would anyone write tests to cover every variation of a highly complex schema? What happens when you get a schema with thousands of variations? How would we know if one of these schema variations caused a bug?
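
To get a feel for how quickly "thousands of variations" can happen, here is a rough sketch that counts the structural shapes a schema allows. This is not the speaker's metric, just one crude way to put a number on it: every optional property doubles the count (present or absent), and every `oneOf` branch adds its own shapes. The function name and the sample schemas are hypothetical.

```python
def count_variations(schema):
    """Crudely count the distinct structural shapes a JSON-Schema-like dict allows."""
    # Simplification: anyOf is treated like oneOf, which undercounts combinations.
    branches = schema.get("oneOf") or schema.get("anyOf")
    if branches:
        return sum(count_variations(branch) for branch in branches)
    if schema.get("type") == "object":
        required = set(schema.get("required", []))
        total = 1
        for name, sub in schema.get("properties", {}).items():
            shapes = count_variations(sub)
            # An optional property may also be absent, which adds one more shape.
            total *= shapes if name in required else shapes + 1
        return total
    if schema.get("type") == "array":
        return count_variations(schema.get("items", {}))
    return 1


# Hypothetical response schema with a few optional parts and one oneOf.
article = {
    "type": "object",
    "required": ["id", "body"],
    "properties": {
        "id": {"type": "integer"},
        "title": {"type": "string"},
        "author": {
            "type": "object",
            "required": ["name"],
            "properties": {"name": {"type": "string"}, "email": {"type": "string"}},
        },
        "body": {
            "oneOf": [
                {"type": "string"},
                {
                    "type": "object",
                    "required": ["text"],
                    "properties": {"text": {"type": "string"}, "format": {"type": "string"}},
                },
            ]
        },
    },
}

# Ten optional flags on a single object.
flags = {
    "type": "object",
    "properties": {f"flag_{i}": {"type": "boolean"} for i in range(10)},
}

print(count_variations(article))  # 18
print(count_variations(flags))    # 1024
```

Even the small `article` schema has 18 shapes, and ten optional properties alone give 1024, which is already more than most test suites would cover.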

Alex: less complex APIs = better docs

Bill: There may be a DEI aspect. Less complexity = more approachable for someone who didn't have opportunity for a formal technical education

We wrote an API description, now what (Matthew Reinbold, Postman)

Great slide titled "The Most Necessary API Testing Is Also Most Dull" https://twitter.com/libel_vox/status/1442929474266284036/photo/1

insert slide "Using API Descriptions in Design Guidelines" when it becomes available

Wednesday Keynote

Speaker (paraphrased): "Specs are communication"

Bill: the "specs as communication" point has implications for our concerns around OpenAPI. Even if OpenAPI has its flaws, this conference has reinforced my opinion that it is sufficiently dominant in the broader technical community that if we don't speak it, we risk being too insular, having difficulty interacting with non-WMF developers, and fostering a not-invented-here mentality.

Bill: Interesting discussion on API security and specs. "You can't secure what you don't know about" and "specs are the map to the API catalog" (paraphrased) make sense to me. From a security point of view, it makes sense that you'd (1) require a spec, (2) require that spec to be discoverable, and (3) require the implementation to match the spec. Digging in code to see if an API implementation has security issues is probably a necessary step, but it should probably be toward the end of a security review and not the beginning.
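
Requirement (3), that the implementation match the spec, can be partly checked mechanically. Below is a minimal sketch that compares the operations documented in an OpenAPI document with the routes an application actually registers. How the list of implemented routes is obtained is framework-specific and assumed here; `compare_routes` and the sample routes are hypothetical.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def compare_routes(spec, implemented):
    """Diff documented (method, path) pairs against the implemented ones."""
    documented = {
        (method.upper(), path)
        for path, item in spec.get("paths", {}).items()
        for method in item
        if method in HTTP_METHODS
    }
    implemented = set(implemented)
    return {
        # Live but not in the spec: "you can't secure what you don't know about".
        "undocumented": sorted(implemented - documented),
        # In the spec but not live: the map no longer matches the territory.
        "unimplemented": sorted(documented - implemented),
    }


# Hypothetical spec and route table.
spec = {
    "paths": {
        "/pages/{title}": {
            "get": {"responses": {"200": {"description": "OK"}}},
            "put": {"responses": {"200": {"description": "OK"}}},
        }
    }
}
routes = [("GET", "/pages/{title}"), ("POST", "/debug/flush-cache")]

report = compare_routes(spec, routes)
print(report["undocumented"])   # [('POST', '/debug/flush-cache')]
print(report["unimplemented"])  # [('PUT', '/pages/{title}')]
```

A check like this only covers which endpoints exist; whether each endpoint behaves as described still needs response-level testing.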

OpenAPI Workflows: describing sequences of operations (David O'Neill, APImetrics)

Bill: someone in chat made the point that specifying sequences is helpful not only for API consumers, but also very helpful for API testing. This is still a formative area in the spec - if I'm understanding correctly it is just at the proposal stage, but might be something we would be interested in being involved with.

Bill: If we wanted to get involved with the workflow specification, they have meetings at 9am Pacific time every 2nd Wednesday (with the next on Oct 13), and also a Slack and Github. david@apimetrics.com is the person to contact for more info.


Leveraging API documentation to deliver reliable API integrations (Jose Haro Peralta, Algorizm Ltd)

Alex: This speaker is doing a demo of testing a spec against API behavior with dredd, "A Language-agnostic HTTP API Testing Tool for early stage development"

Alex: He calls out that these tests aren’t exhaustive, and he combines it with schemathesis, "A modern API testing tool for web applications built with Open API and GraphQL specifications"

Alex: Mock server for developing client code against a spec: prism, "Turn any OpenAPI2/3 and Postman Collection file into an API server with mocking, transformations and validations."

Alex: For GraphQL: graphql-faker, "Mock or extend your GraphQL API with faked data. No coding required"

Speaker: Most common API validation errors:

  1. required properties not present
  2. additional properties present
  3. non-nullable properties being null
  4. status codes not defined being present
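
The four error types above are simple enough to check by hand, which helps show what tools like dredd and schemathesis are looking for. The following is a toy sketch, not how either tool works internally. It assumes a simplified, hypothetical structure that maps each status code straight to an object schema, and uses the OpenAPI 3.0 `nullable` keyword.

```python
def check_response(operation, status, body):
    """Return the problems found in one observed response.

    `operation` maps status code strings to object schemas (a simplified,
    hypothetical stand-in for an OpenAPI `responses` object).
    """
    schema = operation.get(str(status))
    if schema is None:
        # Error type 4: the API returned a status code the spec never mentions.
        return [f"status code {status} is not defined in the spec"]

    problems = []
    props = schema.get("properties", {})

    # Error type 1: required properties not present.
    for name in schema.get("required", []):
        if name not in body:
            problems.append(f"required property '{name}' is missing")

    # Error type 2: additional properties present (only when the schema forbids them).
    if schema.get("additionalProperties") is False:
        for name in body:
            if name not in props:
                problems.append(f"additional property '{name}' is present")

    # Error type 3: non-nullable properties being null.
    for name, value in body.items():
        if value is None and name in props and not props[name].get("nullable", False):
            problems.append(f"non-nullable property '{name}' is null")

    return problems


# Hypothetical operation with a single documented response.
page_get = {
    "200": {
        "type": "object",
        "required": ["id", "title"],
        "additionalProperties": False,
        "properties": {
            "id": {"type": "integer"},
            "title": {"type": "string"},
            "summary": {"type": "string", "nullable": True},
        },
    }
}

for problem in check_response(page_get, 200, {"id": 1, "title": None, "extra": "x", "summary": None}):
    print(problem)
print(check_response(page_get, 404, {}))
```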

note to self: this sounds like it was a great session, and I should watch it when the recordings are published

Governing APIs at Scale (Tim Burks, Google)

This session might be interesting to Seve once recordings are posted. There is a lot of good info from a product roadmap and organizational perspective

Google API Lifecycle Model:

(imagine these in a big circle):

  • Design
  • Develop
  • Secure
  • Publish
  • Scale
  • Monitor
  • Analyze
  • Monetize

12 Requirements for an API Governance Platform:

  1. Inclusion
  2. Shared Language (Bill: by this they meant "vocabulary" not "language" as in English vs Spanish)
  3. Revision Histories
  4. Metadata
  5. Lifecycle Model
  6. Search
  7. Style Guides
  8. Scoring
  9. Policies and Controls
  10. Integrations
  11. Open Source
  12. Enterprise-Readiness

Google API Improvement Proposals (AIP) site. No, "AIP" isn't a typo of "API" :)

Bill: I like the discussion on "scoring" vs hard "pass/fail" that's occurring in session chat. The idea is that an API might pass CI even if it doesn't perfectly pass every requirement, if its overall score is good enough. Pass/fail might be reasonable for new APIs, but would be hard for legacy APIs because it might be too high a bar. I wonder how adaptable our CI system is to that sort of concept, and if it'd be possible to configure a required score on a per-repository basis. (It might be good to meet legacy APIs where they are so we don't interrupt development/deployment, then bump up the required score over time to encourage a trend toward compliance without being authoritarian.)

Alex: totally agree. scoring is a better incentive for improvement as well
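
As a thought experiment on what score-based gating with a per-repository bar could look like, here is a minimal sketch. It does not describe our CI system or anything shown in the session: the check names, weights, repository names, and thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    weight: int = 1


def score(results):
    """Weighted percentage of passed checks, from 0.0 to 100.0."""
    total = sum(r.weight for r in results)
    if total == 0:
        return 100.0
    earned = sum(r.weight for r in results if r.passed)
    return 100.0 * earned / total


# Hypothetical per-repository bars: strict for new APIs, lenient for legacy ones.
# The legacy bar can be raised over time to encourage a trend toward compliance.
REQUIRED_SCORE = {"new-service": 100.0, "legacy-api": 60.0}
DEFAULT_REQUIRED_SCORE = 90.0


def ci_gate(repo, results):
    """Return True when the repository's score meets its required score."""
    return score(results) >= REQUIRED_SCORE.get(repo, DEFAULT_REQUIRED_SCORE)


results = [
    CheckResult("has-openapi-spec", passed=True, weight=3),
    CheckResult("operations-have-descriptions", passed=False),
    CheckResult("paths-follow-naming-convention", passed=True),
]

print(score(results))                   # 80.0
print(ci_gate("legacy-api", results))   # True
print(ci_gate("new-service", results))  # False
```

The same set of results passes for the legacy repository and fails for the new one, which is the "meet them where they are" behavior described above.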

Essential Ingredients for a Successful API Program (Jason Harmon, Stoplight)

Bill: per the speaker, the #1 most important step for introducing an API program is "Universal vocabulary". He recommends "Business Capability Modeling", in which you "define capabilities with nouns that customers would understand". And in addition to defining the capabilities, define the relationships among them. This might sound mostly applicable to the "Universal Wikimedia API" (encyclopedia API) idea that some people (including Seve recently) have talked about. But we ran into some issues with this even in Core REST, when we had trouble with words like "article" vs "page".

Style Guide + Automation

Good: document API style guide in a central place that everyone learns and API reviewers look for. Share publicly!

  • Difficult to be 100% accurate in reviews
  • Convention checks are the #1 time and quality bottleneck

Better: automate style guide rules where possible

  • Spectral: Open Source OpenAPI linter, with built-in and custom rulesets
  • Lint at design-time (Stoplight platform) and in CI/CD (Spectral/OSS)
  • By automating most of your style guide, API reviews should be much more about substance than form
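
To illustrate what an automated style guide rule amounts to, here is a toy linter over an OpenAPI document. It is not Spectral and does not use Spectral's ruleset format; the two rules and their names (`paths-kebab-case`, `operation-has-description`) are hypothetical examples of the kind of convention checks a style guide might contain.

```python
import re

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}
KEBAB_SEGMENT = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def lint(spec):
    """Return (path, rule, message) findings for two example style rules."""
    findings = []
    for path, item in spec.get("paths", {}).items():
        # Rule 1: literal path segments use kebab-case; {parameters} are skipped.
        for segment in path.strip("/").split("/"):
            if segment.startswith("{") and segment.endswith("}"):
                continue
            if segment and not KEBAB_SEGMENT.match(segment):
                findings.append((path, "paths-kebab-case", f"segment '{segment}' is not kebab-case"))
        # Rule 2: every operation has a non-empty description.
        for method, operation in item.items():
            if method in HTTP_METHODS and not operation.get("description"):
                findings.append((path, "operation-has-description", f"{method.upper()} has no description"))
    return findings


# Hypothetical spec with one conforming and one non-conforming path.
spec = {
    "paths": {
        "/page_history/{id}": {"get": {"responses": {"200": {"description": "OK"}}}},
        "/search-results": {
            "get": {
                "description": "Search pages by keyword.",
                "responses": {"200": {"description": "OK"}},
            }
        },
    }
}

for path, rule, message in lint(spec):
    print(f"{path}: [{rule}] {message}")
```

With checks like these running in CI, a human reviewer no longer has to spot naming or missing-description issues and can focus on whether the design itself makes sense.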

adidas API guidelines

Alex: “we’re here to make you look cool. you don’t have to know it all” <-- love that as a message from api platform governance to teams

Speaker: "Platform transformations are hard because of the culture change, not the technology"

Speaker: "Find the API thought leaders in the org, regardless of the org chart"

Speaker: "auth should be the first thing on the page"

So you think you understand JSON Schema? (Ben Hutton, Postman/JSON Schema)

Bill: JSON Schema looks useful, but it is definitely its own skill. Having JSON Schemas would be helpful for testing/validation, but asking people to write them for their endpoints would be tough - I know it'd take me some time to figure it out for even moderately complicated response bodies. There's probably tooling to help people, though...

misc stuff

  • stoplightio/spectral A flexible JSON/YAML linter for creating automated style guides, with baked in support for OpenAPI v2 & v3.
  • docs.pact.io Fast and reliable testing for your APIs and microservices during development. Safety during deployment.
  • APIs.guru Wikipedia for Web APIs (or at least, that's what they claim to be...)