Rendering Markdown with the GitHub Markdown API. I wanted to convert the Markdown used in my TILs to HTML, using the exact same configuration as GitHub does. GitHub has a whole load of custom extensions for things like tables and syntax highlighting (see issue 17). It turns out they have an API for this!
Python-Markdown includes an API for extension writers to plug their own custom functionality and syntax into the parser. An extension will patch into one or more stages of the parser:
- Preprocessors alter the source before it is passed to the parser.
- Block Processors work with blocks of text separated by blank lines.
- Tree Processors modify the constructed ElementTree.
- Inline Processors are common tree processors for inline elements, such as `*strong*`.
- Postprocessors munge the output of the parser just before it is returned.
The parser loads text, applies the preprocessors, creates and builds an ElementTree object from the block processors and inline processors, renders the ElementTree object as Unicode text, and then applies the postprocessors.
There are classes and helpers provided to ease writing your extension. Each part of the API is discussed in its respective section below. Additionally, you can walk through the Tutorial on Writing Extensions or look at some of the Available Extensions and their source code. As always, you may report bugs, ask for help, and discuss various other issues on the bug tracker.
Phases of processing¶
Preprocessors¶
Preprocessors munge the source text before it is passed to the Markdown parser. This is an excellent place to clean up bad characters or to extract portions for later processing that the parser may otherwise choke on.
Preprocessors inherit from `markdown.preprocessors.Preprocessor` and implement a `run` method, which takes a single parameter `lines`. This parameter is the entire source text stored as a list of Unicode strings, one per line. `run` should return its processed list of Unicode strings, one per line.
Example¶
This simple example removes any lines with ‘NO RENDER’ before processing:
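A minimal sketch of such a preprocessor could look like the following. The class name `NoRender` is an illustrative choice, and the processor is called directly here rather than through a registered extension.

```python
import re

from markdown.preprocessors import Preprocessor


class NoRender(Preprocessor):
    """Drop every line that contains the words 'NO RENDER'."""

    def run(self, lines):
        # `lines` is the whole document as a list of strings, one per line.
        return [line for line in lines if not re.search("NO RENDER", line)]


# Calling run() directly, outside of a full Markdown conversion.
kept = NoRender().run(["keep me", "NO RENDER: drop me", "keep me too"])
```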
Usages¶
Some preprocessors in the Markdown source tree include:
Class | Kind | Description |
---|---|---|
NormalizeWhiteSpace | built-in | Normalizes whitespace by expanding tabs, fixing `\r` line endings, etc. |
HtmlBlockPreprocessor | built-in | Removes html blocks from the text and stores them for later processing |
ReferencePreprocessor | built-in | Removes reference definitions from text and stores for later processing |
MetaPreprocessor | extension | Strips and records meta data at top of documents |
FootnotesPreprocessor | extension | Removes footnote blocks from the text and stores them for later processing |
Block Processors¶
A block processor parses blocks of text and adds new elements to the `ElementTree`. Blocks of text, separated from other text by blank lines, may have a different syntax and produce a differently structured tree than other Markdown. Block processors excel at code formatting, equation layouts, and tables.
Block processors inherit from `markdown.blockprocessors.BlockProcessor`, are passed `md.parser` on initialization, and implement both the `test` and `run` methods:
- `test(self, parent, block)` takes two parameters: `parent` is the parent `ElementTree` element and `block` is a single, multi-line, Unicode string of the current block. `test`, often a regular expression match, returns a true value if the block processor's `run` method should be called to process starting at that block.
- `run(self, parent, blocks)` has the same `parent` parameter as `test`; and `blocks` is the list of all remaining blocks in the document, starting with the `block` passed to `test`. `run` may return `False` (not `None`) to signal failure, meaning that it did not process the blocks after all. On success, `run` is expected to `pop` one or more blocks from the front of `blocks` and attach new nodes to `parent`.
Crafting block processors is more involved and flexible than the other processors, involving controlling recursive parsing of the block's contents and managing state across invocations. For example, a blank line is allowed in indented code, so the second invocation of the code block processor appends to the element tree generated by the previous call. Other block processors may insert new text into the `blocks` list, signal to future calls of themselves, and more.
To make writing these complex beasts more tractable, three convenience functions have been provided by the `BlockProcessor` parent class:
- `lastChild(parent)` returns the last child of the given element or `None` if it has no children.
- `detab(text)` removes one level of indent (four spaces by default) from the front of each line of the given multi-line text string, until a non-blank line is indented less.
- `looseDetab(text, level)` removes multiple levels of indent from the front of each line of `text` but does not affect lines indented less.
Also, `BlockProcessor` provides the fields `self.tab_length`, the tab length (default 4), and `self.parser`, the current `BlockParser` instance.
BlockParser¶
`BlockParser`, not to be confused with `BlockProcessor`, is the class used by Markdown to cycle through all the registered block processors. You should never need to create your own instance; use `self.parser` instead.
The `BlockParser` instance provides a stack of strings for its current state, which your processor can push with `self.parser.set(state)`, pop with `self.parser.reset()`, or check the top state with `self.parser.isstate(state)`. Be sure your code pops the states it pushes.
The `BlockParser` instance can also be called recursively, that is, to process blocks from within your block processor. There are three methods:
- `parseDocument(lines)` parses a list of lines, each a single-line Unicode string, returning a complete `ElementTree`.
- `parseChunk(parent, text)` parses a single, multi-line, possibly multi-block, Unicode string `text` and attaches the resulting tree to `parent`.
- `parseBlocks(parent, blocks)` takes a list of `blocks`, each a multi-line Unicode string without blank lines, and attaches the resulting tree to `parent`.
For perspective, Markdown calls `parseDocument`, which calls `parseChunk`, which calls `parseBlocks`, which calls your block processor, which, in turn, might call one of these routines.
Example¶
This example calls out important paragraphs by giving them a border. It looks for a fence line of exclamation points before and after and renders the fenced blocks into a new, styled `div`. If it does not find the ending fence line, it does nothing.
Our code, like most block processors, is longer than other examples:
Start with this example input:
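A sketch of a block processor matching this description, together with the example input, might look as follows. The names `BoxBlockProcessor` and `BoxExtension`, the regular expressions, the inline style, and the priority of 175 are illustrative assumptions.

```python
import re
import xml.etree.ElementTree as etree

import markdown
from markdown.blockprocessors import BlockProcessor
from markdown.extensions import Extension


class BoxBlockProcessor(BlockProcessor):
    RE_FENCE_START = r'^ *!{3,} *\n'  # opening fence line, e.g. "!!!!!"
    RE_FENCE_END = r'\n *!{3,}\s*$'   # closing fence as the last non-blank line

    def test(self, parent, block):
        return re.match(self.RE_FENCE_START, block)

    def run(self, parent, blocks):
        original_block = blocks[0]
        blocks[0] = re.sub(self.RE_FENCE_START, '', blocks[0])
        # Look for the block that carries the closing fence.
        for block_num, block in enumerate(blocks):
            if re.search(self.RE_FENCE_END, block):
                blocks[block_num] = re.sub(self.RE_FENCE_END, '', block)
                # Render the fenced blocks recursively inside a styled div.
                div = etree.SubElement(parent, 'div')
                div.set('style', 'display: inline-block; border: 1px solid red;')
                self.parser.parseBlocks(div, blocks[0:block_num + 1])
                # Remove the blocks that were consumed.
                del blocks[0:block_num + 1]
                return True
        # No closing fence: restore the first block and report failure.
        blocks[0] = original_block
        return False


class BoxExtension(Extension):
    def extendMarkdown(self, md):
        md.parser.blockprocessors.register(BoxBlockProcessor(md.parser), 'box', 175)


text = """A regular paragraph of text.

!!!!!
First paragraph of wrapped text.

Second Paragraph of **wrapped** text.
!!!!!

Another regular paragraph of text.
"""
html = markdown.markdown(text, extensions=[BoxExtension()])
```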
The fenced text adds one node with two children to the tree:
- `div`, with a `style` attribute. It renders as `<div>...</div>`.
  - `p` with text `First paragraph of wrapped text.`
  - `p` with text `Second Paragraph of **wrapped** text`. The conversion to a `<strong>` tag will happen when running the inline processors, which will happen after all of the block processors have completed.
The example output might display as follows:
A regular paragraph of text.
First paragraph of wrapped text.
Second Paragraph of wrapped text.
Another regular paragraph of text.
Usages¶
Some block processors in the Markdown source tree include:
Class | Kind | Description |
---|---|---|
HashHeaderProcessor | built-in | Title hashes (# ), which may split blocks |
HRProcessor | built-in | Horizontal lines, e.g., --- |
OListProcessor | built-in | Ordered lists; complex and using state |
Admonition | extension | Render each Admonition in a new div |
Tree processors¶
Tree processors manipulate the tree created by block processors. They can even create an entirely new ElementTree object. This is an excellent place for creating summaries, adding collected references, or last minute adjustments.
A tree processor must inherit from `markdown.treeprocessors.Treeprocessor` (note the capitalization). A tree processor must implement a `run` method which takes a single argument `root`. In most cases `root` would be an `xml.etree.ElementTree.Element` instance; however, in rare cases it could be some other type of ElementTree object. The `run` method may return `None`, in which case the (possibly modified) original `root` object is used, or it may return an entirely new `Element` object, which will replace the existing `root` object and all of its children. It is generally preferred to modify `root` in place and return `None`, which avoids creating multiple copies of the entire document tree in memory.
For specifics on manipulating the ElementTree, see Working with the ElementTree below.
Example¶
A pseudo example:
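As one possible concrete version, the sketch below modifies `root` in place and returns `None`. The class names, the `lead` CSS class, and the priority of 15 are hypothetical choices for illustration.

```python
import markdown
from markdown.extensions import Extension
from markdown.treeprocessors import Treeprocessor


class LeadTreeprocessor(Treeprocessor):
    def run(self, root):
        # Mark the first top-level paragraph, if there is one.
        first = root.find('p')
        if first is not None:
            first.set('class', 'lead')
        # No return statement is the same as `return None`:
        # the modified `root` is used as-is.


class LeadExtension(Extension):
    def extendMarkdown(self, md):
        md.treeprocessors.register(LeadTreeprocessor(md), 'lead', 15)


html = markdown.markdown(
    "First paragraph.\n\nSecond paragraph.",
    extensions=[LeadExtension()],
)
```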
Usages¶
The core `InlineProcessor` class is a tree processor. It walks the tree, matches patterns, and splits and creates nodes on matches.
Additional tree processors in the Markdown source tree include:
Class | Kind | Description |
---|---|---|
PrettifyTreeprocessor | built-in | Add line breaks to the html document |
TocTreeprocessor | extension | Builds a table of contents from the finished tree |
FootnoteTreeprocessor | extension | Create footnote div at end of document |
FootnotePostTreeprocessor | extension | Amend div created by FootnoteTreeprocessor with duplicates |
Inline Processors¶
Inline processors, previously called inline patterns, are used to add formatting, such as `**emphasis**`, by replacing a matched pattern with a new element tree node. It is an excellent place for adding new syntax for inline tags. Inline processor code is often quite short.
Inline processors inherit from `InlineProcessor`, are initialized, and implement `handleMatch`:
- `__init__(self, pattern, md=None)` is the inherited constructor. You do not need to implement your own.
  - `pattern` is the regular expression string that must match the code block in order for the `handleMatch` method to be called.
  - `md`, an optional parameter, is a pointer to the instance of `markdown.Markdown` and is available as `self.md` on the `InlineProcessor` instance.
- `handleMatch(self, m, data)` must be implemented in all `InlineProcessor` subclasses.
  - `m` is the regular expression match object found by the `pattern` passed to `__init__`.
  - `data` is a single, multi-line, Unicode string containing the entire block of text around the pattern. A block is text set apart by blank lines.
  - Returns either `(None, None, None)`, indicating the provided match was rejected, or `(el, start, end)`, if the match was successfully processed. On success, `el` is the element being added to the tree, and `start` and `end` are indexes in `data` that were "consumed" by the pattern. The "consumed" span will be replaced by a placeholder. The same inline processor may be called several times on the same block.
Inline Processors can define the property `ANCESTOR_EXCLUDES`, which is either a list or tuple of undesirable ancestors. The processor will be skipped if it would cause the content to be a descendant of one of the listed tag names.
Convenience Classes¶
Convenience subclasses of `InlineProcessor` are provided for common operations:
- `SimpleTextInlineProcessor` returns the text of `group(1)` of the match.
- `SubstituteTagInlineProcessor` is initialized as `SubstituteTagInlineProcessor(pattern, tag)`. It returns a new element `tag` whenever `pattern` is matched.
- `SimpleTagInlineProcessor` is initialized as `SimpleTagInlineProcessor(pattern, tag)`. It returns an element `tag` with a text field of `group(2)` of the match.
Example¶
This example changes `--strike--` to `<del>strike</del>`.
Use this input example:
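A sketch of the processor and the example input could look like this. The names `DelInlineProcessor` and `DelExtension`, the pattern, and the priority of 175 are illustrative assumptions.

```python
import xml.etree.ElementTree as etree

import markdown
from markdown.extensions import Extension
from markdown.inlinepatterns import InlineProcessor


class DelInlineProcessor(InlineProcessor):
    def handleMatch(self, m, data):
        el = etree.Element('del')
        el.text = m.group(1)
        # Report the element and the span of `data` it replaces.
        return el, m.start(0), m.end(0)


class DelExtension(Extension):
    def extendMarkdown(self, md):
        DEL_PATTERN = r'--(.*?)--'  # like --del--
        md.inlinePatterns.register(DelInlineProcessor(DEL_PATTERN, md), 'del', 175)


text = """First line of the block.
This is --strike one--.
This is --strike two--.
End of the block.
"""
html = markdown.markdown(text, extensions=[DelExtension()])
```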
The example output might display as follows:
First line of the block. This is strike one. This is strike two. End of the block.
- On the first call to `handleMatch`:
  - `m` will be the match for `--strike one--`.
  - `data` will be the string `First line of the block.\nThis is --strike one--.\nThis is --strike two--.\nEnd of the block.`
  Because the match was successful, the region between the returned `start` and `end` is replaced with a placeholder token and the new element is added to the tree.
- On the second call to `handleMatch`:
  - `m` will be the match for `--strike two--`.
  - `data` will be the string `First line of the block.\nThis is klzzwxh:0000.\nThis is --strike two--.\nEnd of the block.`
Note the placeholder token `klzzwxh:0000`. This allows the regular expression to be run against the entire block, not just the text contained in an individual element. The placeholders will later be swapped back out for the actual elements by the parser.
Actually, it would not be necessary to create the above inline processor. That example is not very DRY (Don't Repeat Yourself). A pattern for `**strong**` text would be almost identical, with the exception that it would create a `strong` element. Therefore, Markdown provides a number of generic `InlineProcessor` subclasses that can provide some common functionality. For example, strike could be implemented with an instance of the `SimpleTagInlineProcessor` class. Feel free to use or extend any of the `InlineProcessor` subclasses found at `markdown.inlinepatterns`.
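A sketch of the strike syntax built on `SimpleTagInlineProcessor` follows. Because that class takes its text from `group(2)`, the pattern starts with an empty first group; the extension name and the priority are illustrative assumptions.

```python
import markdown
from markdown.extensions import Extension
from markdown.inlinepatterns import SimpleTagInlineProcessor


class DelExtension(Extension):
    def extendMarkdown(self, md):
        # The empty group () keeps the struck text in group(2).
        md.inlinePatterns.register(
            SimpleTagInlineProcessor(r'()--(.*?)--', 'del'), 'del', 175
        )


html = markdown.markdown("This is --strike--.", extensions=[DelExtension()])
```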
Usages¶
Here are some convenience functions and other examples:
Class | Kind | Description |
---|---|---|
AsteriskProcessor | built-in | Emphasis processor for handling strong and em matches inside asterisks |
AbbrInlineProcessor | extension | Apply tag to abbreviation registered by preprocessor |
WikiLinksInlineProcessor | extension | Link [[article names]] to wiki given in metadata |
FootnoteInlineProcessor | extension | Replaces footnote in text with link to footnote div at bottom |
Patterns¶
In version 3.0, a new, more flexible inline processor was added, `markdown.inlinepatterns.InlineProcessor`. The original inline patterns, which inherit from `markdown.inlinepatterns.Pattern` or one of its children, are still supported, though users are encouraged to migrate.
Comparison with new `InlineProcessor`¶
The new `InlineProcessor` provides two major enhancements to `Patterns`:
- Inline Processors no longer need to match the entire block, so regular expressions no longer need to start with `r'^(.*?)'` and end with `r'(.*)$'`. This runs faster. The returned match object will only contain what is explicitly matched in the pattern, and extension pattern groups now start with `m.group(1)`.
- The `handleMatch` method now takes an additional input called `data`, which is the entire block under analysis, not just what is matched with the specified pattern. The method now returns the element and the indexes relative to `data` that the return element is replacing (usually `m.start(0)` and `m.end(0)`). If the boundaries are returned as `None`, it is assumed that the match did not take place, and nothing will be altered in `data`. This allows handling of more complex constructs than regular expressions can handle, e.g., matching nested brackets, and explicit control of the span "consumed" by the processor.
Inline Patterns¶
Inline Patterns can implement inline HTML element syntax for Markdown such as `*emphasis*` or `[links](http://example.com)`. Pattern objects should be instances of classes that inherit from `markdown.inlinepatterns.Pattern` or one of its children. Each pattern object uses a single regular expression and must have the following methods:
- `getCompiledRegExp()`: Returns a compiled regular expression.
- `handleMatch(m)`: Accepts a match object and returns an ElementTree element or a plain Unicode string.
Inline Patterns can define the property `ANCESTOR_EXCLUDES`, which is either a list or tuple of undesirable ancestors. The pattern will be skipped if it would cause the content to be a descendant of one of the listed tag names.
Note that any regular expression returned by `getCompiledRegExp` must capture the whole block. Therefore, they should all start with `r'^(.*?)'` and end with `r'(.*)$'`. When using the default `getCompiledRegExp()` method provided in the `Pattern` class, you can pass in a regular expression without that and `getCompiledRegExp` will wrap your expression for you and set the `re.DOTALL` and `re.UNICODE` flags. This means that the first group of your match will be `m.group(2)`, as `m.group(1)` will match everything before the pattern.
For an example, consider this simplified emphasis pattern:
As discussed in Integrating Your Code Into Markdown, an instance of this class will need to be provided to Markdown. That instance would be created like so:
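A sketch of a simplified emphasis pattern and of creating its instance might read as follows. The class name `EmphasisPattern` and the deliberately oversimplified regular expression are assumptions; the match is driven by hand here only to show which group holds the emphasized text.

```python
import xml.etree.ElementTree as etree

from markdown.inlinepatterns import Pattern


class EmphasisPattern(Pattern):
    def handleMatch(self, m):
        el = etree.Element('em')
        # group(1) is everything before the match, so the
        # first group of our own expression is group(2).
        el.text = m.group(2)
        return el


MYPATTERN = r'\*([^*]+)\*'             # an oversimplified regex
emphasis = EmphasisPattern(MYPATTERN)  # pass in the pattern, create the instance

# Exercise the wrapped expression by hand.
m = emphasis.getCompiledRegExp().match("some *emphasized* text")
el = emphasis.handleMatch(m)
```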
Postprocessors¶
Postprocessors munge the document after the ElementTree has been serialized into a string. Postprocessors should be used to work with the text just before output. Usually, they are used to add back sections that were extracted in a preprocessor, fix up outgoing encodings, or wrap the whole document.
Postprocessors inherit from `markdown.postprocessors.Postprocessor` and implement a `run` method which takes a single parameter `text`, the entire HTML document as a single Unicode string. `run` should return a single Unicode string ready for output. Note that preprocessors use a list of lines while postprocessors use a single multi-line string.
Example¶
Here is a simple example that changes the output to one big page showing the raw html.
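One way to write such a postprocessor is sketched below. The class name is an illustrative choice, and `run` is called directly rather than through a registered extension.

```python
import re

from markdown.postprocessors import Postprocessor


class ShowActualHtmlPostprocessor(Postprocessor):
    """Wrap the entire output in <pre> tags as a diagnostic."""

    def run(self, text):
        # Escape opening angle brackets so the markup is displayed, not rendered.
        return '<pre>\n' + re.sub('<', '&lt;', text) + '</pre>\n'


shown = ShowActualHtmlPostprocessor().run('<p>Hello</p>')
```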
Usages¶
Some postprocessors in the Markdown source tree include:
Class | Kind | Description |
---|---|---|
raw_html | built-in | Restore raw html from htmlStash, stored by HTMLBlockPreprocessor, and code highlighters |
amp_substitute | built-in | Convert ampersand substitutes to &; used in links |
unescape | built-in | Convert some escaped characters back from integers; used in links |
FootnotePostProcessor | extension | Replace footnote placeholders with html entities; as set by other stages |
Working with the ElementTree¶
As mentioned, the Markdown parser converts a source document to an ElementTree object before serializing that back to Unicode text. Markdown has provided some helpers to ease that manipulation within the context of the Markdown module.
First, import the ElementTree module:
Sometimes you may want text inserted into an element to be parsed by Inline Patterns. In such a situation, simply insert the text as you normally would and the text will be automatically run through the Inline Patterns. However, if you do not want some text to be parsed by Inline Patterns, then insert the text as an `AtomicString`.
Here's a basic example which creates an HTML table (note that the contents of the second cell (`td2`) will be run through Inline Patterns later):
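A sketch of that table, including the ElementTree import, might look like this. The cell contents are placeholder text.

```python
import xml.etree.ElementTree as etree

from markdown.util import AtomicString

table = etree.Element("table")
table.set("cellpadding", "2")                    # set cellpadding to 2
tr = etree.SubElement(table, "tr")               # add child tr to table
td1 = etree.SubElement(tr, "td")                 # add child td to tr
td1.text = AtomicString("Cell content")          # plain text, skipped by inline patterns
td2 = etree.SubElement(tr, "td")                 # add second td to tr
td2.text = "*text* with **inline** formatting."  # markup text, parsed later
table.tail = "Text after table"                  # text that follows the table
```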
You can also manipulate an existing tree. Consider the following example which adds a `class` attribute to `<a>` elements:
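A minimal sketch of such a helper, using only the standard library, is shown below. The function name, the `myclass` value, and the sample tree are illustrative assumptions.

```python
import xml.etree.ElementTree as etree


def set_link_class(element, css_class="myclass"):
    """Recursively set the class attribute on every <a> below `element`."""
    for child in element:
        if child.tag == "a":
            child.set("class", css_class)
        set_link_class(child, css_class)  # descend into the children


root = etree.fromstring(
    '<div><p>See <a href="#x">x</a> and <em><a href="#y">y</a></em>.</p></div>'
)
set_link_class(root)
classes = [a.get("class") for a in root.iter("a")]
```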
For more information about working with ElementTree, see the ElementTree Documentation.
Working with Raw HTML¶
Occasionally an extension may need to call out to a third party library which returns a pre-made string of raw HTML that needs to be inserted into the document unmodified. Raw strings can be stashed for later retrieval using an `htmlStash` instance, rather than converting them into `ElementTree` objects. A raw string (which may or may not be raw HTML) passed to `self.md.htmlStash.store()` will be saved to the stash and a placeholder string will be returned which should be inserted into the tree instead. After the tree is serialized, a postprocessor will replace the placeholder with the raw string. This prevents subsequent processing steps from modifying the HTML data. For example:
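The sketch below stashes a raw string from inside a tree processor and inserts the returned placeholder into the tree. The class names, the priority, and the choice of a tree processor as the host are assumptions made for illustration.

```python
import xml.etree.ElementTree as etree

import markdown
from markdown.extensions import Extension
from markdown.treeprocessors import Treeprocessor


class RawFooterTreeprocessor(Treeprocessor):
    def run(self, root):
        html = "<p>This is some <em>raw</em> HTML data</p>"
        el = etree.SubElement(root, "div")
        # store() returns a placeholder; the raw string is swapped
        # back in by a postprocessor after serialization.
        el.text = self.md.htmlStash.store(html)


class RawFooterExtension(Extension):
    def extendMarkdown(self, md):
        md.treeprocessors.register(RawFooterTreeprocessor(md), 'raw_footer', 15)


html = markdown.markdown("Some text.", extensions=[RawFooterExtension()])
```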
For the global `htmlStash` instance to be available from a processor, the `markdown.Markdown` instance must be passed to the processor from `extendMarkdown` and will be available as `self.md.htmlStash`.
Integrating Your Code Into Markdown¶
Once you have the various pieces of your extension built, you need to tell Markdown about them and ensure that they are run in the proper sequence. Markdown accepts an `Extension` instance for each extension. Therefore, you will need to define a class that extends `markdown.extensions.Extension` and overrides the `extendMarkdown` method. Within this class you will manage configuration options for your extension and attach the various processors and patterns to the Markdown instance.
It is important to note that the order of the various processors and patterns matters. For example, if we replace `http://...` links with `<a>` elements, and then try to deal with inline HTML, we will end up with a mess. Therefore, the various types of processors and patterns are stored within an instance of the `markdown.Markdown` class in a Registry. Your `Extension` class will need to manipulate those registries appropriately. You may `register` instances of your processors and patterns with an appropriate priority, `deregister` built-in instances, or replace a built-in instance with your own.
`extendMarkdown`¶
The `extendMarkdown` method of a `markdown.extensions.Extension` class accepts one argument:
- `md`: A pointer to the instance of the `markdown.Markdown` class. You should use this to access the Registries of processors and patterns. They are found under the following attributes:
  - `md.preprocessors`
  - `md.inlinePatterns`
  - `md.parser.blockprocessors`
  - `md.treeprocessors`
  - `md.postprocessors`
  Some other things you may want to access on the `markdown.Markdown` instance are:
  - `md.htmlStash`
  - `md.output_formats`
  - `md.set_output_format()`
  - `md.output_format`
  - `md.serializer`
  - `md.registerExtension()`
  - `md.tab_length`
  - `md.block_level_elements`
  - `md.isBlockLevel()`
Warning
With access to the above items, theoretically you have the option to change anything through various monkey_patching techniques. However, you should be aware that the various undocumented parts of Markdown may change without notice and your monkey_patches may break with a new release. Therefore, what you really should be doing is inserting processors and patterns into the Markdown pipeline. Consider yourself warned!
A simple example:
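As a sketch, an extension that registers one preprocessor might look like this. The `%%` comment syntax, the class names, and the priority of 25 are hypothetical and chosen only for illustration.

```python
import markdown
from markdown.extensions import Extension
from markdown.preprocessors import Preprocessor


class StripCommentLines(Preprocessor):
    def run(self, lines):
        # Drop lines that start with the (hypothetical) %% comment marker.
        return [line for line in lines if not line.startswith('%%')]


class StripCommentsExtension(Extension):
    def extendMarkdown(self, md):
        # Register an instance under a name, with a priority of 25.
        md.preprocessors.register(StripCommentLines(md), 'strip_comments', 25)


html = markdown.markdown(
    "%% internal note\nVisible text.",
    extensions=[StripCommentsExtension()],
)
```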
registerExtension¶
Some extensions may need to have their state reset between multiple runs of the `markdown.Markdown` class. For example, consider the following use of the Footnotes extension:
Without calling `reset`, the footnote definitions from the first document will be inserted into the second document as they are still stored within the class instance. Therefore the `Extension` class needs to define a `reset` method that will reset the state of the extension (i.e.: `self.footnotes = {}`). However, as many extensions do not have a need for `reset`, `reset` is only called on extensions that are registered.
To register an extension, call `md.registerExtension` from within your `extendMarkdown` method:
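The sketch below registers an extension that keeps a running line count as its state. The names `CountingExtension`, `LineCounter`, and `line_count` are hypothetical.

```python
import markdown
from markdown.extensions import Extension
from markdown.preprocessors import Preprocessor


class LineCounter(Preprocessor):
    def __init__(self, md, ext):
        super().__init__(md)
        self.ext = ext

    def run(self, lines):
        self.ext.line_count += len(lines)  # accumulate state on the extension
        return lines


class CountingExtension(Extension):
    def extendMarkdown(self, md):
        md.registerExtension(self)  # ask Markdown to call our reset()
        md.preprocessors.register(LineCounter(md, self), 'line_counter', 25)

    def reset(self):
        self.line_count = 0


ext = CountingExtension()
md = markdown.Markdown(extensions=[ext])  # reset() runs once during setup
md.convert("one\ntwo")
count_after_convert = ext.line_count
md.reset()
count_after_reset = ext.line_count
```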
Then, each time `reset` is called on the `markdown.Markdown` instance, the `reset` method of each registered extension will be called as well. You should also note that `reset` will be called on each registered extension after it is initialized the first time. Keep that in mind when overriding the extension's `reset` method.
Configuration Settings¶
If an extension uses any parameters that the user may want to change, those parameters should be stored in `self.config` of your `markdown.extensions.Extension` class in the following format:
When implemented this way, the configuration parameters can be overridden at run time (thus the call to `super`). For example:
Note that if a keyword is passed in that is not already defined in `self.config`, then a `KeyError` is raised.
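The format, a run-time override, and the `KeyError` case can be sketched together as follows. The option names and values are placeholders, and the empty `extendMarkdown` exists only so the class can be handed to `markdown.Markdown`.

```python
import markdown
from markdown.extensions import Extension


class MyExtension(Extension):
    def __init__(self, **kwargs):
        # Each entry maps a key to [default value, description].
        self.config = {
            'option1': ['value1', 'description1'],
            'option2': ['value2', 'description2'],
        }
        super().__init__(**kwargs)  # applies any overrides in kwargs

    def extendMarkdown(self, md):
        pass  # processors and patterns would be registered here


# Override one option at run time.
ext = MyExtension(option1='other value')
md = markdown.Markdown(extensions=[ext])

# A keyword that is not defined in self.config is rejected.
try:
    MyExtension(unknown='x')
    unknown_rejected = False
except KeyError:
    unknown_rejected = True
```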
The `markdown.extensions.Extension` class and its subclasses have the following methods available to assist in working with configuration settings:
- `getConfig(key [, default])`: Returns the stored value for the given `key` or `default` if the `key` does not exist. If not set, `default` returns an empty string.
- `getConfigs()`: Returns a dict of all key/value pairs.
- `getConfigInfo()`: Returns all configuration descriptions as a list of tuples.
- `setConfig(key, value)`: Sets a configuration setting for `key` with the given `value`. If `key` is unknown, a `KeyError` is raised. If the previous value of `key` was a Boolean value, then `value` is converted to a Boolean value. If the previous value of `key` is `None`, then `value` is converted to a Boolean value except when it is `None`. No conversion takes place when the previous value of `key` is a string.
- `setConfigs(items)`: Sets multiple configuration settings given a dict of key/value pairs.
Naming an Extension¶
As noted in the library reference, an instance of an extension can be passed directly to `markdown.Markdown`. In fact, this is the preferred way to use third-party extensions.
For example:
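As a sketch, the bundled `TocExtension` stands in below for a third-party extension; the `permalink` option and the sample text are illustrative choices.

```python
import markdown
from markdown.extensions.toc import TocExtension

# Pass a configured instance rather than a string name.
md = markdown.Markdown(extensions=[TocExtension(permalink=True)])
html = md.convert("# Title\n\nBody text.")
```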
However, Markdown also accepts "named" third party extensions for those occasions when it is impractical to import an extension directly (from the command line or from within templates). A "name" can either be a registered entry point or a string using Python's dot notation.
Entry Point¶
Entry points are defined in a Python package's `setup.py` script. The script must use setuptools to support entry points. Python-Markdown extensions must be assigned to the `markdown.extensions` group. An entry point definition might look like this:
After a user installs your extension using the above script, they could then call the extension using the `myextension` string name like this:
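The entry point definition can be sketched as plain data. The module path `path.to.module` and the class `MyExtension` are placeholders, and the `setup()` call is left as a comment so the fragment has no side effects.

```python
# Sketch of the entry point mapping that would be passed to setuptools.
entry_points = {
    'markdown.extensions': [
        # name = importable.module:ExtensionClass (placeholder path)
        'myextension = path.to.module:MyExtension',
    ],
}

# In setup.py this would be handed to setuptools, roughly:
#   from setuptools import setup
#   setup(entry_points=entry_points)  # plus the other parameters
#
# After installation, the extension could be loaded by name:
#   markdown.markdown(text, extensions=['myextension'])
```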
Note that if two or more entry points within the same group are assigned the same name, Python-Markdown will only ever use the first one found and ignore all others. Therefore, be sure to give your extension a unique name.
For more information on writing `setup.py` scripts, see the Python documentation on Packaging and Distributing Projects.
Dot Notation¶
If an extension does not have a registered entry point, Python's dot notation may be used instead. The extension must be installed as a Python module on your PYTHONPATH. Generally, a class should be specified in the name. The class must be at the end of the name and be separated by a colon from the module.
Therefore, if you were to import the class like this:
Then the extension can be loaded as follows:
You do not need to do anything special to support this feature. As long as your extension class is able to be imported, a user can include it with the above syntax.
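To illustrate the dot notation with something importable, the sketch below uses the bundled table of contents extension in place of a third-party module.

```python
import markdown

# Equivalent import: from markdown.extensions.toc import TocExtension
html = markdown.markdown(
    "# Title",
    extensions=['markdown.extensions.toc:TocExtension'],
)
```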
The above two methods are especially useful if you need to implement a large number of extensions with more than one residing in a module. However, if you do not want to require that your users include the class name in their string, you must define only one extension per module and that module must contain a module-level function called `makeExtension` that accepts `**kwargs` and returns an extension instance.
For example:
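A minimal sketch of such a module follows; `MyExtension` is a placeholder class with an empty `extendMarkdown`.

```python
from markdown.extensions import Extension


class MyExtension(Extension):
    def extendMarkdown(self, md):
        pass  # define the extension here


def makeExtension(**kwargs):
    # Module-level factory that Markdown calls when no class is named.
    return MyExtension(**kwargs)
```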
When `markdown.Markdown` is passed the "name" of your extension as a dot notation string that does not include a class (for example `path.to.module`), it will import the module and call the `makeExtension` function to initiate your extension.
Registries¶
The `markdown.util.Registry` class is a priority sorted registry which Markdown uses internally to determine the processing order of its various processors and patterns.
A `Registry` instance provides two public methods to alter the data of the registry: `register` and `deregister`. Use `register` to add items and `deregister` to remove items. See each method for specifics.
When registering an item, a "name" and a "priority" must be provided. All items are automatically sorted by the value of the "priority" parameter such that the item with the highest value will be processed first. The "name" is used to remove (`deregister`) and get items.
A `Registry` instance is like a list (which maintains order) when reading data. You may iterate over the items, get an item and get a count (length) of all items. You may also check that the registry contains an item.
When getting an item you may use either the index of the item or the string-based “name”. For example:
When checking that the registry contains an item, you may use either the string-based "name", or a reference to the actual item. For example:
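Getting items and checking membership can be sketched as follows. The registered items are bare placeholder objects, and the names and priorities are arbitrary.

```python
from markdown.util import Registry

registry = Registry()
low = object()
high = object()
registry.register(low, 'low', 10)
registry.register(high, 'high', 20)

# Items are sorted by priority, highest first.
first_by_index = registry[0]   # get an item by index
by_name = registry['low']      # get an item by name

has_name = 'high' in registry  # check by name
has_item = low in registry     # check by reference to the item

index_of_low = registry.get_index_for_name('low')
registry.deregister('low')
registry.deregister('missing', strict=False)  # fails silently
```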
`markdown.util.Registry` has the following methods:
`Registry.register(self, item, name, priority)`¶
Add an item to the registry with the given name and priority.
Parameters:
- `item`: The item being registered.
- `name`: A string used to reference the item.
- `priority`: An integer or float used to sort against all items.
If an item is registered with a "name" which already exists, the existing item is replaced with the new item. Tread carefully as the old item is lost with no way to recover it. The new item will be sorted according to its priority and will not retain the position of the old item.
`Registry.deregister(self, name, strict=True)`¶
Remove an item from the registry.
Set `strict=False` to fail silently.
`Registry.get_index_for_name(self, name)`¶
Return the index of the given `name`.
Article version: GitHub.com
Learn the foundations for using the REST API, starting with authentication and some endpoint examples.
Let's walk through core API concepts as we tackle some everyday use cases.
Overview
Most applications will use an existing wrapper library in the language of your choice, but it's important to familiarize yourself with the underlying API HTTP methods first.
There's no easier way to kick the tires than through cURL. If you are using an alternative client, note that you are required to send a valid User Agent header in your request.
Hello World
Let's start by testing our setup. Open up a command prompt and enter the following command:
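For readers using a client other than cURL, a rough Python equivalent built on the standard library is sketched below. The helper name `build_request` and the `my-example-app` User Agent value are assumptions; the request is built but not sent.

```python
import urllib.request

API_ROOT = "https://api.github.com"


def build_request(path, user_agent="my-example-app"):
    """Build (but do not send) a GET request for a GitHub API path."""
    # A User-Agent header is required by the API.
    return urllib.request.Request(API_ROOT + path, headers={"User-Agent": user_agent})


zen_request = build_request("/zen")
# To actually send it:
#   urllib.request.urlopen(zen_request).read().decode()
```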

The response will be a random selection from our design philosophies.
Next, let's `GET` Chris Wanstrath's GitHub profile:
Mmmmm, tastes like JSON. Let's add the `-i` flag to include headers:
There are a few interesting bits in the response headers. As expected, the `Content-Type` is `application/json`.
Any headers beginning with `X-` are custom headers, and are not included in the HTTP spec. For example:
- `X-GitHub-Media-Type` has a value of `github.v3`. This lets us know the media type for the response. Media types have helped us version our output in API v3. We'll talk more about that later.
- Take note of the `X-RateLimit-Limit` and `X-RateLimit-Remaining` headers. This pair of headers indicates how many requests a client can make in a rolling time period (typically an hour) and how many of those requests the client has already spent.
Authentication
Unauthenticated clients can make 60 requests per hour. To get more requests per hour, we'll need to authenticate. In fact, doing anything interesting with the GitHub API requires authentication.
Using personal access tokens
The easiest and best way to authenticate with the GitHub API is by using Basic Authentication via OAuth tokens. OAuth tokens include personal access tokens.
Use a `-u` flag to set your username:
When prompted, you can enter your OAuth token, but we recommend you set up a variable for it:
You can use `-u "username:$token"` and set up a variable for `token` to avoid leaving your token in shell history. Use double quotes so that the shell expands the variable.
When authenticating, you should see your rate limit bumped to 5,000 requests an hour, as indicated in the `X-RateLimit-Limit` header. In addition to providing more calls per hour, authentication enables you to read and write private information using the API.
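The same Basic Authentication scheme can be sketched in Python with the standard library. The helper name, the `octocat` username, the `GITHUB_TOKEN` environment variable, and the fake fallback token are all assumptions; the request is built but not sent.

```python
import base64
import os
import urllib.request


def authenticated_request(path, username, token):
    """Build a GET request that carries Basic Authentication credentials."""
    credentials = base64.b64encode(f"{username}:{token}".encode()).decode()
    return urllib.request.Request(
        "https://api.github.com" + path,
        headers={
            "Authorization": "Basic " + credentials,
            "User-Agent": username,
        },
    )


# Read the token from the environment rather than hard-coding it.
token = os.environ.get("GITHUB_TOKEN", "fake-token-for-illustration")
request = authenticated_request("/user", "octocat", token)
```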
You can easily create a personal access token using your Personal access tokens settings page:
Get your own user profile
When properly authenticated, you can take advantage of the permissions associated with your GitHub account. For example, try getting your own user profile:
This time, in addition to the same set of public information we retrieved for @defunkt earlier, you should also see the non-public information for your user profile. For example, you'll see a `plan` object in the response which gives details about the GitHub plan for the account.
Using OAuth tokens for apps
Apps that need to read or write private information using the API on behalf of another user should use OAuth.
OAuth uses tokens. Tokens provide two big features:
- Revokable access: users can revoke authorization to third party apps at any time
- Limited access: users can review the specific access that a token will provide before authorizing a third party app
Tokens should be created via a web flow. An applicationsends users to GitHub to log in. GitHub then presents a dialogindicating the name of the app, as well as the level of access the apphas once it's authorized by the user. After a user authorizes access, GitHubredirects the user back to the application:
Treat OAuth tokens like passwords! Don't share them with other users or store them in insecure places. The tokens in these examples are fake and the names have been changed to protect the innocent.
Now that we've got the hang of making authenticated calls, let's move along to the Repositories API.
Repositories
Almost any meaningful use of the GitHub API will involve some level of Repository information. We can GET repository details in the same way we fetched user details earlier:
In the same way, we can view repositories for the authenticated user:
Or, we can list repositories for another user:
Or, we can list repositories for an organization:
The information returned from these calls will depend on which scopes our token has when we authenticate:
- A token with public_repo scope returns a response that includes all public repositories we have access to see on github.com.
- A token with repo scope returns a response that includes all public and private repositories we have access to see on GitHub.
As the docs indicate, these methods take a type parameter that can filter the repositories returned based on what type of access the user has for the repository. In this way, we can fetch only directly-owned repositories, organization repositories, or repositories the user collaborates on via a team.
In this example, we grab only those repositories that octocat owns, not the ones on which she collaborates. Note the quoted URL above. Depending on your shell setup, cURL sometimes requires a quoted URL or else it ignores the query string.
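When the URL is built in code rather than typed into a shell, the quoting concern goes away. A small Python sketch, assuming the user repositories endpoint is /users/<username>/repos and that owner is an accepted value for type (the function name is hypothetical):

```python
import urllib.parse

API_ROOT = "https://api.github.com"


def user_repos_url(username, repo_type="owner"):
    """Return the URL that lists a user's repositories, filtered by type."""
    # urlencode assembles the query string, so no shell quoting is involved.
    query = urllib.parse.urlencode({"type": repo_type})
    return f"{API_ROOT}/users/{urllib.parse.quote(username)}/repos?{query}"


# Only the repositories octocat owns, not the ones she collaborates on.
url = user_repos_url("octocat")
```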
Create a repository
Fetching information for existing repositories is a common use case, but the GitHub API supports creating new repositories as well. To create a repository, we need to POST some JSON containing the details and configuration options.
In this minimal example, we create a new private repository for our blog (to be served on GitHub Pages, perhaps). Though the blog will be public, we've made the repository private. In this single step, we'll also initialize it with a README and a nanoc-flavored .gitignore template.
The resulting repository will be found at https://github.com/<your_username>/blog. To create a repository under an organization for which you're an owner, just change the API method from /user/repos to /orgs/<org_name>/repos.
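A Python sketch of the request described above, building the POST without sending it. The JSON field names (name, auto_init, private, gitignore_template) reflect my understanding of GitHub's repository-creation parameters and should be checked against the API reference; the function name and the token argument are hypothetical.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_create_repo_request(token, org=None):
    """Build a POST request that creates a private 'blog' repository."""
    payload = {
        "name": "blog",
        "auto_init": True,              # initialize with a README
        "private": True,
        "gitignore_template": "nanoc",  # nanoc-flavored .gitignore
    }
    # Same payload either way; only the path differs for an organization.
    path = f"/orgs/{org}/repos" if org else "/user/repos"
    request = urllib.request.Request(
        API_ROOT + path,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )
    request.add_header("Authorization", f"token {token}")
    request.add_header("Content-Type", "application/json")
    return request
```

Passing the result to urllib.request.urlopen would send it; that call is left out so the sketch has no side effects.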
Next, let's fetch our newly created repository:
Oh noes! Where did it go? Since we created the repository as private, we need to authenticate in order to see it. If you're a grizzled HTTP user, you might expect a 403 instead. Since we don't want to leak information about private repositories, the GitHub API returns a 404 in this case, as if to say 'we can neither confirm nor deny the existence of this repository.'
Issues
The UI for Issues on GitHub aims to provide 'just enough' workflow while staying out of your way. With the GitHub Issues API, you can pull data out or create issues from other tools to create a workflow that works for your team.
Just like github.com, the API provides a few methods to view issues for the authenticated user. To see all your issues, call GET /issues:
To get only the issues under one of your GitHub organizations, call GET /orgs/<org>/issues:
We can also get all the issues under a single repository:
Pagination
A project the size of Rails has thousands of issues. We'll need to paginate, making multiple API calls to get the data. Let's repeat that last call, this time taking note of the response headers:
The Link header provides a way for a response to link to external resources, in this case additional pages of data. Since our call found more than thirty issues (the default page size), the API tells us where we can find the next page and the last page of results.
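To follow those links programmatically, the Link header has to be split into its rel names and URLs. A minimal Python sketch, assuming the usual <url>; rel="name" form with entries separated by commas; the function name and the example header value (including its page numbers) are made up for illustration.

```python
import re


def parse_link_header(value):
    """Map each rel name (for example "next" or "last") to its URL."""
    links = {}
    # Simplification: assumes the URLs themselves contain no commas.
    for part in value.split(","):
        match = re.match(r'\s*<([^>]+)>\s*;\s*rel="([^"]+)"', part)
        if match:
            url, rel = match.groups()
            links[rel] = url
    return links


# Hypothetical header value, shaped like a paginated issues response.
example = (
    '<https://api.github.com/repos/rails/rails/issues?page=2>; rel="next", '
    '<https://api.github.com/repos/rails/rails/issues?page=30>; rel="last"'
)
links = parse_link_header(example)
```

A client would keep requesting links["next"] until the header no longer contains a next entry.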
Creating an issue
Now that we've seen how to paginate lists of issues, let's create an issue from the API.
To create an issue, we need to be authenticated, so we'll pass an OAuth token in the header. Also, we'll pass the title, body, and labels in the JSON body to the /issues path underneath the repository in which we want to create the issue:
The response gives us a couple of pointers to the newly created issue, both in the Location response header and the url field of the JSON response.
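The same call can be sketched in Python. This assumes the repository's issues path is /repos/<owner>/<repo>/issues and that the token is passed as "Authorization: token ..."; the function name and all argument values in the usage line are hypothetical.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_create_issue_request(token, owner, repo, title, body, labels):
    """Build a POST request that opens an issue in owner/repo."""
    payload = {"title": title, "body": body, "labels": labels}
    request = urllib.request.Request(
        f"{API_ROOT}/repos/{owner}/{repo}/issues",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )
    # The OAuth token travels in the Authorization header.
    request.add_header("Authorization", f"token {token}")
    request.add_header("Content-Type", "application/json")
    return request


# Hypothetical usage; sending it with urlopen would create a real issue.
request = build_create_issue_request(
    "fake-token", "octocat", "blog", "New logo", "We should have one", ["design"]
)
```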
Conditional requests
A big part of being a good API citizen is respecting rate limits by caching information that hasn't changed. The API supports conditional requests and helps you do the right thing. Consider the first call we made to get defunkt's profile:
In addition to the JSON body, take note of the HTTP status code of 200 and the ETag header. The ETag is a fingerprint of the response. If we pass that on subsequent calls, we can tell the API to give us the resource again, only if it has changed:
The 304 status indicates that the resource hasn't changed since the last time we asked for it and the response will contain no body. As a bonus, 304 responses don't count against your rate limit.
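A conditional request can be sketched in Python as follows. It assumes the saved ETag is sent back in an If-None-Match header, which is the standard HTTP mechanism for this; the function names are hypothetical. Note that urllib reports a 304 as an HTTPError, so the sketch catches it rather than checking a status code.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, etag):
    """Build a GET request that asks for the resource only if it changed."""
    request = urllib.request.Request(url)
    request.add_header("If-None-Match", etag)
    return request


def fetch_if_changed(url, etag):
    """Return (body, etag); body is None when the server answers 304."""
    try:
        with urllib.request.urlopen(build_conditional_request(url, etag)) as response:
            # 200: the resource changed, so return the new body and new ETag.
            return response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as error:
        if error.code == 304:
            # Unchanged: keep using the cached copy and the same ETag.
            return None, etag
        raise
```

A caller would store the returned ETag next to its cached copy and pass it in on the next call.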
Woot! Now you know the basics of the GitHub API!
- Basic & OAuth authentication
- Fetching and creating repositories and issues
- Conditional requests
Keep learning with the next API guide Basics of Authentication!
