WordPress GSoC Trac

Opened 11 months ago

Closed 7 months ago

#264 closed enhancement (fixed)

Complete design documents for entities and collections

Reported by: rmccue
Owned by: rmccue
Milestone: 2013 Final (1.0)
Priority: normal
Component: JSON REST API
Keywords:
Cc: prettyboymp

Description

Tracking ticket for week 2.

Change History (17)

comment:1 rmccue 11 months ago

  • Type changed from defect to enhancement

comment:2 rmccue 11 months ago

See r1977, r1978 and r1979

I've decided to go with using JSON Schema for the most part. I'm going to remove most of the ABNF syntax, except for field values that need to be parsed (e.g. the routes in the Index entity). I'm happy to write ABNF, but it's much too verbose for data that's already in a structured, parsable form.

comment:3 rmccue 11 months ago

The JSON Schema can also validate our endpoints automatically via validators like http://json-schema-validator.herokuapp.com/

For testing, we may even be able to use a PHP-based validator to test our endpoints against the schema, which is a bonus.
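
As a rough illustration of the kind of automated check described here, the sketch below validates a response body against a small schema fragment. It is a hand-rolled check covering only `required` and `type`, not a full JSON Schema validator, and the schema fragment and field names are assumptions made for the example rather than the project's actual schema.

```python
import json

# Hypothetical schema fragment; the real schema lives in the project's repository.
POST_SCHEMA = {
    "type": "object",
    "required": ["ID", "title", "comment_status"],
    "properties": {
        "ID": {"type": "integer"},
        "title": {"type": "string"},
        "comment_status": {"type": "string"},
    },
}

# Map JSON Schema type names to Python types (subset only).
TYPES = {"object": dict, "string": str, "integer": int, "boolean": bool}


def validate(instance, schema):
    """Return a list of error strings; an empty list means the instance passed."""
    errors = []
    if not isinstance(instance, TYPES[schema["type"]]):
        return ["expected %s" % schema["type"]]
    for name in schema.get("required", []):
        if name not in instance:
            errors.append("missing required property: %s" % name)
    for name, sub in schema.get("properties", {}).items():
        if name in instance:
            value = instance[name]
            # bool is a subclass of int in Python, so reject it for "integer".
            if sub["type"] == "integer" and isinstance(value, bool):
                errors.append("%s: expected integer" % name)
            elif not isinstance(value, TYPES[sub["type"]]):
                errors.append("%s: expected %s" % (name, sub["type"]))
    return errors


response_body = '{"ID": 1, "title": "Hello world!", "comment_status": "open"}'
print(validate(json.loads(response_body), POST_SCHEMA))
```

A test suite could run a check like this against every endpoint response and fail on a non-empty error list; a real validator library would cover the rest of the JSON Schema vocabulary.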

comment:4 rmccue 11 months ago

See r1981, r1982 for more JSON Schema improvements.

I'm debating whether to change comment_status and ping_status to booleans. Both are usually either open or closed, but it's possible that a plugin could change this. Any thoughts on this?
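
To make the trade-off concrete, the sketch below shows what a boolean would lose. The `members_only` status is a made-up value standing in for whatever a plugin might register; it is not a real WordPress status.

```python
def to_boolean(status):
    """Collapse a status string to a boolean, as the boolean design would."""
    return status == "open"


# "open" and "closed" are the usual values; "members_only" is hypothetical.
for status in ("open", "closed", "members_only"):
    print(status, "->", to_boolean(status))

# A string keeps the plugin-defined value; a boolean reports it as closed.
```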

comment:5 rmccue 11 months ago

Separated schema into a full JSON Schema document in r1983, cleaned it up slightly in r1989.

Rewrote most of the specification in r1984. Technical changes in r1985, r1986, r1987, r1990.

Index, Post, User and Metadata entities are now fully specified. Still needed are specifications for the Comment, Media, Taxonomy, Term and Entity Meta entities. Also need specifications for Media Collections, Taxonomy Collections, Term Collections and User Collections.

comment:6 rmccue 11 months ago

In r1991:

Split changelog into separate file

See #264

comment:7 bpetty 10 months ago

  • Milestone set to 2013 Midterm (Beta)

comment:8 follow-ups: prettyboymp 10 months ago

Voce has recently been working on an implementation of a JSON API for WordPress, Thermal API. While working on it, there were some questions we had to work through and I thought it would help to bring these up early. I figure some of our solutions could probably be better but hope that they could also help contribute some insight.

What is the plan to deal with some clients (JavaScript) not being able to handle big int accurately?

What is the plan for handling shortcodes such as galleries and embeds? Is the client responsible for parsing these out for rendering? Will the provider provide implementation details within the entity?

Will it be possible for the client to catch internal links within content and get back the corresponding data from the provider?

Will it be possible for the client to choose the proper image size for embedded images?

comment:9 in reply to: ↑ 8 ; follow-up: bpetty 10 months ago

Replying to prettyboymp:

What is the plan to deal with some clients (JavaScript) not being able to handle big int accurately?

For that matter, PHP doesn't handle BIGINT on 32-bit servers either, but WP still used the PHP int type to represent row IDs on those systems anyway. On 32-bit servers, the PHP max integer value is about 2 billion, far below JavaScript's exact limit of 9,007,199,254,740,992 (2^53). For things like comment counts, we can still get useful results higher than that, just not exact.

Since we're talking about JSON, the only real solution to this is to use strings instead of numbers if we were honestly worried about accurately representing numbers above 9 quadrillion. However, I really don't think we are - especially considering the case of 32-bit PHP, and I'm fairly certain it would end up being more of a headache implementing clients in various languages (not just JS) that are required to do string conversions on those values everywhere.

It's also worth noting that the WP.com REST API represents these as numbers as well.

There might be a very select few values where we might care, and in those cases, we could still represent just those values with strings if necessary. Are there any you're concerned about specifically? I really can't think of any myself.
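
A quick way to see where the exact limit sits: JavaScript numbers are IEEE 754 doubles, and so are Python floats, so the sketch below uses Python's `float` as a stand-in for a JavaScript client. The sample ID is arbitrary.

```python
import json

LIMIT = 2 ** 53          # 9,007,199,254,740,992
PHP_INT_MAX_32 = 2 ** 31 - 1  # 2,147,483,647 on 32-bit PHP

# A 32-bit PHP integer is nowhere near the limit.
print(PHP_INT_MAX_32 < LIMIT)

# Integers up to 2**53 are represented exactly as doubles.
assert float(LIMIT - 1) != float(LIMIT)

# Above 2**53, neighbouring integers start to collide.
print(float(LIMIT) == float(LIMIT + 1))

# Sent as a JSON number and read as a double, the ID is off by one;
# sent as a string, it survives the round trip untouched.
big_id = LIMIT + 1
as_number = float(json.loads(json.dumps({"id": big_id}))["id"])
as_string = json.loads(json.dumps({"id": str(big_id)}))["id"]
print(int(as_number) == big_id, int(as_string) == big_id)
```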

What is the plan for handling shortcodes such as galleries and embeds? Is the client responsible for parsing these out for rendering? Will the provider provide implementation details within the entity?

I realize that you might be thinking in the context of a valid editor or author using 3rd party publishing tools that need access to this for writing or editing post content. This is not currently within the scope of this Summer of Code project though (we're not dealing with authentication just yet).

No publicly accessible endpoint will ever contain raw post content that hasn't had its shortcodes processed (exposing it could be a security issue).

It would still be interesting to see ideas for possibly exposing the list of supported shortcodes and endpoints that could process those codes for 3rd party publishers, and how that might be implemented though.

Will it be possible for the client to catch internal links within content and get back the corresponding data from the provider?

I assume you're referring to a self-describing REST API, and this was briefly discussed on the wp-hackers mailing list, but I don't think any decision was made. For now, it's not on the table.

Personally, I really don't see this feature in many REST APIs, and in the ones that have it, it isn't even used most of the time. It just ends up being a waste of bandwidth since these APIs frequently end up using a client library that already knows the endpoints with a versioned API, and the additional per-object URIs aren't even necessary. That's just my opinion though.

However, the API is already planning on at least listing the available routes that the given WP installation supports (this should mostly come in handy for testing an installation's support for certain custom plugin endpoints). Since I'm sure that plugin API endpoints will need to be versioned in some way on their own apart from core endpoints, I could see how this *might* still be helpful, but it still doesn't remove the need for a plugin API endpoint version indicator anyway (for supported properties), and you can still always check that instead.

comment:10 in reply to: ↑ 9 ; follow-up: prettyboymp 10 months ago

For things like comment counts, we can still get useful results higher than that, just not exact.

I was mostly concerned about comment IDs being accurate when dealing with replies. I realize that PHP on 32-bit servers can't process the BIGINT either, but I also hope that no site getting that many comments is running on a 32-bit server.

Will it be possible for the client to catch internal links within content and get back the corresponding data from the provider?

I assume you're referring to a self-describing REST API, and this was briefly discussed on the wp-hackers mailing list, but I don't think any decision was made. For now, it's not on the table.

I was actually referring to being able to parse links within the content of a post that reference another post, page, or taxonomy term on the site and convert those to a second API request that could get the represented entities to render. This doesn't have to be a self-describing API in order to work. In Thermal, we're exposing the rewrite rules in the API, which allows any same-origin links to be converted into filters which the /posts/ endpoint accepts.
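
The idea can be sketched roughly as follows. The rule format, the `year`, `name` and `category_name` filter names, and the example URLs are all invented for illustration; they are not taken from Thermal's actual output.

```python
import re
from urllib.parse import urlencode, urlparse

SITE = "http://example.com"  # assumed site origin

# Hypothetical rewrite rules: a path regex plus the filters it maps to.
REWRITE_RULES = [
    (re.compile(r"^/(?P<year>\d{4})/(?P<name>[^/]+)/?$"), ("year", "name")),
    (re.compile(r"^/category/(?P<category_name>[^/]+)/?$"), ("category_name",)),
]


def link_to_api_request(href):
    """Turn a same-origin permalink into a /posts/ query, or return None."""
    url = urlparse(href)
    if (url.scheme + "://" + url.netloc) != SITE:
        return None  # external link: leave it alone
    for pattern, names in REWRITE_RULES:
        match = pattern.match(url.path)
        if match:
            filters = [(name, match.group(name)) for name in names]
            return "/posts/?" + urlencode(filters)
    return None  # same origin, but no rule matched


print(link_to_api_request("http://example.com/2013/hello-world/"))
print(link_to_api_request("http://other.example.org/2013/hello-world/"))
```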

comment:11 in reply to: ↑ 10 bpetty 10 months ago

Replying to prettyboymp:

For things like comment counts, we can still get useful results higher than that, just not exact.

I was mostly concerned about comment IDs being accurate when dealing with replies. I realize that PHP on 32-bit servers can't process the BIGINT either, but I also hope that no site getting that many comments is running on a 32-bit server.

Yeah, comments and post meta typically have the highest table IDs in the DB (afaik). I hope no one has over 9 quadrillion comments either. If you look at WP.com stats, and consider a 10:1 ratio on spam (based on Akismet stats), WP.com has seen somewhere around 10 billion comments (including spam) over the entire lifetime of WP.com since 2005/2006. But when you realize that WP.com is multisite, every blog also has its own comments table, so they still probably don't pop up over maybe 10 to 50 million even for the most popular blogs.

Basically what I'm saying though is that I don't think this is an issue we need to worry about. Even if WP.com stored all comments on every blog in the same table (they don't), we'd still be able to use integers reliably for centuries to come through the REST API without problems.

Will it be possible for the client to catch internal links within content and get back the corresponding data from the provider?

I assume you're referring to a self-describing REST API, and this was briefly discussed on the wp-hackers mailing list, but I don't think any decision was made. For now, it's not on the table.

I was actually referring to being able to parse links within the content of a post that reference another post, page, or taxonomy term on the site and convert those to a second API request that could get the represented entities to render. This doesn't have to be a self-describing API in order to work. In Thermal, we're exposing the rewrite rules in the API, which allows any same-origin links to be converted into filters which the /posts/ endpoint accepts.

Ah, this is something completely different than what I thought you were talking about. I just looked over the Thermal API docs for the rewrite rules, and now understand what you mean. This is pretty cool and very useful, and maybe something to consider integrating in the future. Worst case scenario though it could still remain in the Thermal API plugin as an extension of the core REST API (whenever it gets around to being merged into core).

comment:12 prettyboymp 10 months ago

  • Cc prettyboymp added

comment:13 in reply to: ↑ 8 rmccue 10 months ago

Sorry for just catching this, GSoC Trac didn't auto-CC me in.

Replying to prettyboymp:

What is the plan to deal with some clients (JavaScript) not being able to handle big int accurately?

As Bryan noted, this isn't really a huge issue, since WP itself can't handle this. As noted in the PHP docs:

The size of an integer is platform-dependent, although a maximum value of about two billion is the usual value (that's 32 bits signed). 64-bit platforms usually have a maximum value of about 9E18.

The 32-bit maximum is far below JavaScript's exact-integer limit, which is 2^53 (about 9E15) as per Twitter's Snowflake announcement; only the 64-bit maximum of about 9E18 exceeds it. Should we ever get into a situation where that needs to be handled (from memory, there was a recent php-internals discussion about supporting native 64-bit integers), we can transition the API to providing it as a string as well, as per Twitter's solution.
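
A minimal sketch of that transition, following the shape of Twitter's approach (a numeric field plus a string twin). The `id_str` name follows Twitter's convention; nothing in this ticket says the WordPress API would use that name.

```python
import json


def with_string_id(entity):
    """Return a copy of the entity with a string twin of its numeric ID."""
    out = dict(entity)
    out["id_str"] = str(entity["id"])
    return out


# An ID above 2**53, which a JavaScript client could not hold exactly.
payload = json.dumps(with_string_id({"id": 2 ** 53 + 1}))
print(payload)

# Clients that cannot trust the number read the string instead.
decoded = json.loads(payload)
print(decoded["id_str"])
```

Existing clients keep reading the numeric field unchanged, so adding the string twin would not break them.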

What is the plan for handling shortcodes such as galleries and embeds? Is the client responsible for parsing these out for rendering? Will the provider provide implementation details within the entity?

The API will return the filtered content with all shortcodes replaced, however it's conceivable that the authenticated version of the API will also return (e.g.) a content_raw property. This is out-of-scope for now, but see #281 for that topic. This *does* affect the ability to write posts with shortcodes, so it may be handled earlier.

I was actually referring to being able to parse links within the content of a post that reference another post, page, or taxonomy term on the site and convert those to a second API request that could get the represented entities to render. This doesn't have to be a self-describing API in order to work. In Thermal, we're exposing the rewrite rules in the API, which allows any same-origin links to be converted into filters which the /posts/ endpoint accepts.

The index endpoint reveals almost literally what the internal rewrite rules for the API are, in the form /posts/<id> where anything in angle brackets is a variable that can be replaced with the associated property.
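
For example, a client could fill in such a route template like this. The helper name and the sample values are illustrative only.

```python
import re


def fill_route(template, values):
    """Replace each <name> placeholder in a route with the matching value."""
    def substitute(match):
        # Raises KeyError if the route names a variable we have no value for.
        return str(values[match.group(1)])
    return re.sub(r"<([^<>]+)>", substitute, template)


print(fill_route("/posts/<id>", {"id": 42}))
```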

Each entity should also contain relevant links. Just realised I forgot to push the documentation on that, see r2003 and the documentation.

Will it be possible for the client to choose the proper image size for embedded images?

At this point, there's no functionality scoped for this - basically, the content property will be exactly what the_content() gives in a template. IMO, this is more of a presentation-layer thing, but I can see the need for it. How does Thermal work with this?

comment:14 prettyboymp 10 months ago

Will it be possible for the client to choose the proper image size for embedded images?

At this point, there's no functionality scoped for this - basically, the content property will be exactly what the_content() gives in a template. IMO, this is more of a presentation-layer thing, but I can see the need for it. How does Thermal work with this?

For internal images, we're returning the available image sizes for each media item of the post; see https://github.com/voceconnect/thermal-api#internal-image-media-item-json-schema. The media items contain embed data for each item, such as shortcodes or images within the content.
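
On the client side, picking a size from such a list could look roughly like this. The `sizes` structure, its field names and the URLs are assumptions for the sketch, not Thermal's documented schema.

```python
def pick_size(sizes, target_width):
    """Pick the narrowest size at least target_width wide, else the widest."""
    wide_enough = [s for s in sizes if s["width"] >= target_width]
    if wide_enough:
        return min(wide_enough, key=lambda s: s["width"])
    return max(sizes, key=lambda s: s["width"])


# Hypothetical sizes for one media item.
sizes = [
    {"name": "thumbnail", "width": 150, "url": "http://example.com/a-150.jpg"},
    {"name": "medium", "width": 300, "url": "http://example.com/a-300.jpg"},
    {"name": "large", "width": 1024, "url": "http://example.com/a-1024.jpg"},
]
print(pick_size(sizes, 320)["name"])
```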

comment:15 bpetty 7 months ago

  • Milestone changed from 2013 Midterm (Beta) to 2013 Final (1.0)
  • Priority changed from major to normal

Pushing to the final milestone for additional docs on remaining entities (like media, see #272).

comment:16 rmccue 7 months ago

In r2355:

Update the schema to match latest changes

See #264

comment:17 rmccue 7 months ago

  • Resolution set to fixed
  • Status changed from new to closed

Closing as fixed.
