Tech Stuff - Survival Guide - ASN.1

This page is a survival guide to ASN.1 (ITU X.680) and DER (Distinguished Encoding Rules, ITU X.690) as it applies to X.509 (SSL) Certificates and other bits of the Public Key Infrastructure (PKIX). There is a lot, we're talking a lot, of stuff about both topics liberally scattered throughout the web. However, we found most of it rather confusing and in many cases incomplete - in the sense it did not answer our specific questions. We simply wanted, tactically, to write a codec for .der/.p12/.cer file interpretation and consequently had to understand both the source ASN.1 and the encoding method. We did not want a PhD in either ASN.1 or DER. We went up a lot of blind alleys (our speciality) and pennies took a long time to drop. This page is the result, because we never want to do this again. Ever. If you find it useful - terrific, if not - add it to your list of confusing and/or incomplete ASN.1 material.

This page does not cover all of ASN.1 but should allow you to read most ASN.1 definitions and decode them from DER. The page uses worked examples of ASN.1 definitions taken from RFC 5912 (and others), which uses the 2002 ASN.1 Standard (X.680 07/2002) - freely available from the ITU, to illustrate the key elements covering most of the ASN.1 syntax. Each example is then shown in its DER encoded form. DER is defined in X.690 (07/2002) - again freely available from the ITU. If you need the whole ASN.1 shebang, in particular to write ASN.1 modules or definitions, there are plenty of books and web articles of varying usefulness scattered around. You could do no better, as a start, than Olivier Dubuisson's ASN.1 (June 5th, 2000) or John Larmouth's ASN.1 Complete (1999). Both books are generously made available at no cost for personal use. Both books pre-date the 2002 standards that modern RFCs (RFC 5912 and later) are based on and therefore lack some essential material.

Bad News: You may also need to read and digest X.681, X.682 and X.683 - we were only joking when we just referenced X.680. But if you want to stay sane, stay away from X.691.

Really Bad News: There is no short cut to understanding this stuff. You have to know a surprising amount before it makes sense. Gird up your loins.

<gratuitous advice> ASN.1 is pretty complex and messy stuff. Drowning in detail is the overriding emotion at the beginning. The ASN.1 2002 standard upgrade (used in all but one of the worked examples) added significantly more features to an already complex syntax. The updated RFCs (starting with RFC 5912) take full advantage of the newer features to create tighter, less ambiguous definitions that ASN.1 parser/encoders just love to death. The downside is they are significantly more complex to read for the neophyte. The older RFCs, essentially anything prior to RFC 5912, typically use the ASN.1 1988 and 1993 standards (and even raw X.208) and are, in general, significantly easier, for the humble human, to read. The DER encoding, whether the ASN.1 source is 1988, 1993 or 2002, will be identical - the worked examples for ContentInfo use both the 1988 and 2002 standards to illustrate this point. There is a delicious irony in the fact that modern readers may love the pre-2002 ASN.1 source, modern ASN.1 parsers/encoders, however, may (will likely) choke on it.</gratuitous advice>

ASN.1 Overview

While traditionally associated with X.509 (SSL) certificates, ASN.1 is widely used in all forms of communication. ASN.1 is an implementation agnostic method for defining data structures that can be communicated unambiguously between heterogeneous systems using a variety of encoding schemes (for example, BER, CER, DER, PER etc.).

Let's start by getting familiar with ASN.1. The following is the ASN.1 fragment of an X.509 certificate from RFC 5912 Section 14 (PKIX1Implicit-2009):

 --
  -- Certificate structures begin here
  --

  Certificate  ::=  SIGNED{TBSCertificate}

  TBSCertificate  ::=  SEQUENCE  {
      version         [0]  Version DEFAULT v1,
      serialNumber         CertificateSerialNumber,
      signature            AlgorithmIdentifier{SIGNATURE-ALGORITHM,
                                {SignatureAlgorithms}},
      issuer               Name,
      validity             Validity,
      subject              Name,
      subjectPublicKeyInfo SubjectPublicKeyInfo,
      ... ,
      [[2:               -- If present, version MUST be v2
      issuerUniqueID  [1]  IMPLICIT UniqueIdentifier OPTIONAL,
      subjectUniqueID [2]  IMPLICIT UniqueIdentifier OPTIONAL
      ]],
      [[3:               -- If present, version MUST be v3 --
      extensions      [3]  Extensions{{CertExtensions}} OPTIONAL
      ]], ... }

  Version  ::=  INTEGER  {  v1(0), v2(1), v3(2)  }

  CertificateSerialNumber  ::=  INTEGER

  Validity ::= SEQUENCE {
      notBefore      Time,
      notAfter       Time  }

  Time ::= CHOICE {
      utcTime        UTCTime,
      generalTime    GeneralizedTime }

  UniqueIdentifier  ::=  BIT STRING

  SubjectPublicKeyInfo  ::=  SEQUENCE  {
      algorithm            AlgorithmIdentifier{PUBLIC-KEY,
                               {PublicKeyAlgorithms}},
      subjectPublicKey     BIT STRING  }

This illustrates most of the key features of ASN.1. Not important that you understand the contents at this point(!) but let's get a couple of basic things out of the way first.

ASN.1 Source Definitions

You can always decode any DER encoded stream without having access to the ASN.1 source and get clean results in the sense that the individual types can be separated and displayed in the format indicated by their type, for example, as an integer or a string.

However, the results, in most cases, are going to be pretty meaningless. Just what does this decoded integer signify or mean? Some intelligent guesswork and meta pattern recognition can help. This mode of operation, where the decoder has no access to the ASN.1 source, is sometimes referred to as schema-less. While the term schema has no meaning within ASN.1 (probably a collection of productions would be the closest ASN.1 terminology) it is frequently used to refer to snippets or fragments defining a particular entity taken from an ASN.1 module. So, if the decoder has access to the ASN.1 source fragment it is frequently said to be schema-aware, if not it is said to be schema-less.

In general, the only way to validly interpret (as opposed to simply decode) the DER is to have access to the ASN.1 source either in its entirety or the relevant snippets/fragments (schema-aware) for the entity being decoded.
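To make the schema-less idea concrete, here is a minimal Python sketch of a decoder that knows nothing about the ASN.1 source. The function names (read_tlv, dump), the tiny UNIVERSAL table and the sample octets are ours, invented for illustration; it assumes single-octet tags and only the handful of types listed. It separates the types cleanly but, as noted above, has no idea what any of them mean.

```python
UNIVERSAL = {2: "INTEGER", 16: "SEQUENCE", 23: "UTCTime", 24: "GeneralizedTime"}

def read_tlv(data, pos):
    """Return (tag_number, constructed, value, next_pos) for the TLV at pos."""
    ident = data[pos]
    length = data[pos + 1]
    pos += 2
    if length & 0x80:                      # long form: low 7 bits count the length octets
        count = length & 0x7F
        length = int.from_bytes(data[pos:pos + count], "big")
        pos += count
    return ident & 0x1F, bool(ident & 0x20), data[pos:pos + length], pos + length

def dump(data, depth=0, out=None):
    """Walk every TLV in data, returning one indented text line per item."""
    out = [] if out is None else out
    pos = 0
    while pos < len(data):
        tag, constructed, value, pos = read_tlv(data, pos)
        name = UNIVERSAL.get(tag, "tag %d" % tag)
        if constructed:
            out.append("  " * depth + name)
            dump(value, depth + 1, out)     # the value is itself a run of TLVs
        elif tag == 2:
            out.append("  " * depth + "%s %d" % (name, int.from_bytes(value, "big", signed=True)))
        elif tag in (23, 24):
            out.append("  " * depth + "%s %s" % (name, value.decode("ascii")))
        else:
            out.append("  " * depth + "%s %s" % (name, value.hex()))
    return out

# hypothetical sample: an INTEGER followed by a SEQUENCE holding two times
sample = bytes.fromhex(
    "301f020101301a170b313730383130313030305a180b323032373038313031305a")
lines = dump(sample)
print("\n".join(lines))
```

The output tells us there is an INTEGER with the value 1 and two times, but only the ASN.1 source can tell us that the INTEGER is a serial number.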

ASN.1 Comments

Since this page uses extensive commenting in ASN.1 definitions to explain things in snippets we'd better start by describing the commenting format. Comments in ASN.1 can take two forms. Single line comments begin with -- (dash dash) and may be terminated with a -- sequence or continue to the end of the line. Multi-line comments use the classic C style /* */ comment sequence:

-- this is single line comment taking the whole line
-- this is single line comment taking the whole line (terminated) --
blah -- this a comment appearing on a line with operational ASN.1
blah -- single line comment (terminated) in the middle of operational ASN.1 -- blah
/* multi-line comment starting on this line
(not extensively used but handy) 
and ends on this line */
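If you are writing your own tooling, the two comment forms can be stripped with a single regular expression. This is a rough Python sketch, not a full lexer: the names COMMENT and strip_comments are ours, and it assumes no -- or /* sequences appear inside quoted strings and that /* */ comments are not nested.

```python
import re

# /* ... */ may span lines; -- ... ends at the next -- or at the end of the line
COMMENT = re.compile(r"/\*.*?\*/|--[^\n]*?(?:--|$)", re.DOTALL | re.MULTILINE)

def strip_comments(source):
    """Remove both ASN.1 comment forms, leaving the operational text."""
    return COMMENT.sub("", source)

sample = (
    "blah -- single line comment (terminated) -- blah\n"
    "number INTEGER, -- comment to end of line\n"
    "/* multi-line comment starting on this line\n"
    "and ends on this line */ Validity ::= SEQUENCE\n"
)
# collapse the leftover whitespace so the result is easy to read
cleaned = " ".join(strip_comments(sample).split())
print(cleaned)
```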

Major breakthrough in understanding already? No. Didn't think so. But it's one thing out the way. Only 7,234 to go.

ASN.1 From 10,000 Feet - Syntax and Terminology

Like many ITU standards ASN.1 starts with the worthy goal of achieving maximum flexibility. This can have the result, IOHO, that important stuff is buried in section x.y.z meaning you have to read the whole specification(s) to get the stuff you want, which, to lazy devils like us, is anathema. We start as 10,000 feet guys trying to figure out what is going on, dropping down to millimeters (mixing our metrics to display virtuosity) only when required.

So, we start this survival guide by trying to define the basic structure and rules of ASN.1 (syntax if you prefer) using terminology that is not always used in the X.680 series (or rather is sometimes used in subsection x.y.z) to try and make it more, IOHO, comprehensible (we usually add the real terminology in parentheses). We may fail miserably in our goal. You will be the judge of that 'cos we understand this page (well, when we wrote it).

If you prefer to avoid our nauseous introductory stuff you can just jump to either the Summary or head for the worked examples.

Note: For consistency we use the loosy, goosey term element to refer to some definition which may consist of a number of parts, and we use the term item to refer to each part of the element - mostly. We also frequently use the term clause to refer to all the stuff (another technical term we use a lot) inside {} (curly brackets or braces). None of the terms we use appears in the ITU specs. They just seemed like a good idea (closer to what we understand) at the time to make sense of this stuff.

ASN.1 source is defined in ASN.1 Modules. Modules can import stuff from other modules so ASN.1 source can get real messy, real fast.

ASN.1 is very, we mean, very case sensitive.

Assignment (Definition/production)

The key ASN.1 element is the assignment (definition if you prefer, production is the ASN.1 official term) using the ::= operator (the assignment lexical item, X.680 Section 5.2 and 11.16). So anything with the ::= symbol is an assignment, the term we will use exclusively throughout this page. If you prefer definition or production you will have to mentally substitute it yourself. Here is an example assignment from the ASN.1 source above:

CertificateSerialNumber  ::=  INTEGER

Types (UserType, UniversalType)

ASN.1 uses the generic term Type almost everywhere. All types have a TypeName (typereference is the X.680 terminology). A TypeName always starts with a Capital letter (some are all CAPITALS). In the above example the left hand name creates what we term a UserType ('cos it's defined by the user) with a TypeName of CertificateSerialNumber and assigns it the single UniversalType of INTEGER (UniversalTypes are BuiltinTypes if you prefer). (It is a UserType because it starts with a Capital and does not appear in the Reserved/KEYWORDS list.) All TypeNames (UniversalType or UserType) have global scope - they can be referenced from anywhere in the ASN.1 module.

Many UniversalTypes are all CAPITALS, some, notably the string types are not, but they all start with a Capital, for example, VisibleString is a UniversalType, so it obeys the basic TypeName rule (it starts with a Capital). Given the above definition we can now use CertificateSerialNumber anywhere in our ASN.1 module and know it is assigned the UniversalType of INTEGER. (Full list of UniversalTypes.)

Like most languages ASN.1 has a list of Reserved/KEYWORDS. In the case of ASN.1 these comprise the UniversalTypes (a mixture of Capitals and CAPITALS) and a further set of KEYWORDS which are all CAPITALS.

Primitive and Constructed

Some UniversalTypes are simple, comprising a single item, and are termed Primitives (like INTEGER); some are more complex, may consist of one or more elements and are termed Constructed. One of the most common Constructed UniversalTypes is SEQUENCE and it would typically be defined like this (again from the initial sample):

Validity ::= SEQUENCE {

 -- notBefore and notAfter are elements in the SEQUENCE
 -- all the elements in a SEQUENCE are typically delimited by 
 -- braces {} (curly brackets)
 
      notBefore      Time,
      notAfter       Time  }

-- Note: The name Time above, starts with a Capital and is not
-- in the list of UniversalTypes or Reserved/KEYWORDS so is probably a
-- UserType (we will see later, it is)

-- ASN.1 is a free format language and the above SEQUENCE assignment
-- could have been written as

Validity ::= SEQUENCE { notBefore Time, notAfter Time }
 
-- which is probably not as clear as the first version

Note that the left hand names notBefore and notAfter above do NOT start with a Capital. This lack of a Capital is highly significant and brings us to the next set of stuff. Be still my beating heart.

identifier

There are two cases where names starting with a lower case letter exist. The first is within braces (in a clause if you prefer the term) such as used with some Constructed UniversalTypes (for example, SEQUENCE or SET) or the Reserved/KEYWORD CHOICE, where it is formally termed an identifier and has a scope only within the structure (within the enclosing braces of the clause). An identifier is defined as:

-- format used within braces {} (curly brackets)
{
-- identifiers do not use the assignment symbol
-- you can call them silent assignments if that appeals to your
-- sense of humour
 
identifier-name(WHITESPACE)Type-name[,]

-- where:
-- identifier-name starts with a lower case letter
-- (WHITESPACE) is one or more of space/VT/HT/LF/FF/CR characters
-- which means that the definition can occupy a single or multiple lines
-- [,] each item except the last is terminated with a comma
-- Type-name may be either a UniversalType or a UserType 
...
}
-- examples
{
-- UserType format
validity  Validity,

-- UniversalType format
number INTEGER,
...
}

Note: When used with the & symbol in CLASS definitions (we'll meet them later) the identifier appears to have scope outside the structure.

valueReference

The second case is a valueReference (we'll start it with a lower case to reflect its format) where it has global scope. A valueReference has a valueName and assigns a specific value to a named Type as opposed to just assigning a Type or Types. Here are two valueReference uses (both taken from RFC 5912):

-- from RFC 5912 page 87 (there is a typo in the RFC, correct version is here)
-- this simply assigns one valueReference to another (an alias)
-- the resolution of the right hand valueReference id-at-clearance-rfc3281 
-- will yield a value and a Type

   id-at-clearance ::= id-at-clearance-rfc3281 
	 
-- and id-at-clearance-rfc3281 has both a value and a Type

id-at-clearance-rfc3281              OBJECT IDENTIFIER ::= {
       joint-iso-ccitt(2) ds(5) module(1) selected-attribute-types(5)
       clearance (55) }
			 
-- this assigns a valueReference to an explicit value of a named Type
-- in this case the Type is the UniversalType OBJECT IDENTIFIER
-- we'll meet OBJECT IDENTIFIER in more detail later

-- and this one from the SubjectAltName worked example

   id-ce-subjectAltName OBJECT IDENTIFIER ::=  { id-ce 17 }
	 
-- where the id-ce part of {id-ce 17} is also a 
-- valueReference with this assignment (a value and a Type)

id-ce OBJECT IDENTIFIER  ::=  { joint-iso-ccitt(2) ds(5) 29 }

-- numbers may be written as namedNumbers as shown
-- or using the functionally identical number format 

id-ce OBJECT IDENTIFIER  ::=  { 2 5 29 }

-- the full value of id-ce-subjectAltName is therefore 2 5 29 17
-- or in typical OID terminology 2.5.29.17
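The way one valueReference builds on another can be sketched in a few lines of Python. The table layout (OIDS) and the resolve function are our own illustration; only the two assignments shown above are modelled.

```python
# valueReference name -> (parent valueReference or None, trailing arcs)
OIDS = {
    "id-ce": (None, (2, 5, 29)),
    "id-ce-subjectAltName": ("id-ce", (17,)),
}

def resolve(name):
    """Follow parent valueReferences until a complete list of arcs is built."""
    parent, arcs = OIDS[name]
    prefix = resolve(parent) if parent else ()
    return prefix + arcs

dotted = ".".join(str(arc) for arc in resolve("id-ce-subjectAltName"))
print(dotted)  # 2.5.29.17
```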

ClassType

Our final (thank goodness) example is what the standard calls an Information Object Class (X.681) but we'll use the term ClassType for reasons that, hopefully, will become clear in a few seconds. ClassType must use a TypeName of all CAPITALS (X.681 7.1). Here is a ClassType assignment from RFC 5912 for the class EXTENSION (which we'll meet in detail later):

-- EXTENSION is the Information Object Type (ClassType) and the assignment
-- always uses the Reserved/KEYWORD CLASS
-- which is why we use the term ClassType

 EXTENSION ::= CLASS {
 
 -- ignore all this weird stuff it's covered in detail later
      &id  OBJECT IDENTIFIER UNIQUE,
      &ExtnType,
      &Critical    BOOLEAN DEFAULT {TRUE | FALSE }
  } WITH SYNTAX {
      SYNTAX &ExtnType IDENTIFIED BY &id
      [CRITICALITY &Critical]
  }
	
-- the class can then be referenced in an assignment such as this
-- which we'll meet later in detail
-- this is a valueReference form

   ext-SubjectAltName EXTENSION ::= { SYNTAX
       GeneralNames IDENTIFIED BY id-ce-subjectAltName }

-- this is a UserType form referencing a mythical class called THING

StupidType THING ::= {
 ... -- stuff inside assignment referencing the THING CLASS assignment
}

ASN.1 Syntax summary

The preceding section contained a superficial overview with lots of detail and nuance still to be covered. Rather than do this in isolation the next sections take some well-known entities and deconstruct them (and DER encode them). However, at the 10,000 foot level we should be able to look at an ASN.1 definition or module and make some general observations about what is going on:

If a name starts with a Capital it defines either a UserType or a UniversalType or a ClassType or may be a Reserved/KEYWORD. Well, that's pretty helpful!
If a name is all CAPITALS it defines either a UniversalType or a ClassType or may be Reserved/KEYWORD but is never a UserType. One down 723 to go.
If a name is all CAPITALS and it's not in the list of UniversalType or Reserved/KEYWORD it is a ClassType and you need to search in the ASN.1 module for its definition (if you need to know more). Precision at last!
If a name starts with a Capital and ends with 'String' then it is most likely (!) a UniversalType with string properties, for example, PrintableString. This is not an immutable rule. You can get caught so check the UniversalTypes. Caveats, caveats.
If a name starts with a lower case letter it is an identifier or a valueReference depending on whether it's inside or outside braces. Say that again.

Sure did help a lot that summary.
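For those who prefer their rules of thumb executable, here is the summary as a small Python function. The name classify is ours and the two word lists are deliberately partial (a handful of single-word UniversalTypes and KEYWORDS, sorted the way this page sorts them), so treat it as a sketch of the rules rather than a lexer.

```python
# partial lists only - enough to exercise the rules of thumb above
UNIVERSAL_TYPES = {
    "INTEGER", "BOOLEAN", "NULL", "SEQUENCE", "SET", "ENUMERATED",
    "UTCTime", "GeneralizedTime", "PrintableString", "VisibleString",
    "UTF8String", "IA5String",
}
KEYWORDS = {
    "CHOICE", "OPTIONAL", "DEFAULT", "IMPLICIT", "EXPLICIT", "CLASS",
    "DEFINITIONS", "BEGIN", "END", "IMPORTS", "FROM", "WITH", "SYNTAX",
}

def classify(name):
    """Apply the capitalisation rules of thumb to a single name."""
    if name in UNIVERSAL_TYPES:
        return "UniversalType"
    if name in KEYWORDS:
        return "Reserved/KEYWORD"
    if name[0].islower():
        return "identifier or valueReference"
    if name.isupper():                # all CAPITALS and not in either list
        return "ClassType"
    return "UserType"

for name in ("INTEGER", "CHOICE", "EXTENSION", "Validity", "notBefore"):
    print(name, "->", classify(name))
```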

ASN.1 Worked Examples

The following sections take an ASN.1 fragment defining a well known or interesting entity, explain any unique characteristics and then show how it is DER encoded.

ASN.1 and DER for X.509 Subset
ASN.1 and DER for ContentInfo - ASN.1 1988 Version
ASN.1 and DER for X.509 V3 Extension SubjectAltName
ASN.1 and DER for ContentInfo - ASN.1 2002 Version
ASN.1 and DER for X.509 (Full)

ASN.1 and DER for X.509 Subset

As a gentle (!) introduction we take a part of the X.509 ASN.1 source from the first sample (from RFC 5912 Section 14 (PKIX1Implicit-2009)) and look at 2 elements and their definitions only, CertificateSerialNumber and Validity:

  -- this is the standard assignment (production) with some
  -- missing elements to keep things nice and simple
  TBSCertificate  ::=  SEQUENCE  {
      
    ....   -- indicates missing elements to simplify  the example
      
    serialNumber         CertificateSerialNumber,
      
    ....   -- indicates missing elements to simplify  the example
      
    validity             Validity,
      
    ....   -- indicates missing elements to simplify  the example
    }
			
CertificateSerialNumber  ::=  INTEGER

  Validity ::= SEQUENCE {
      notBefore      Time,
      notAfter       Time  }

  Time ::= CHOICE {
      utcTime        UTCTime,
      generalTime    GeneralizedTime }

So, gird up your loins and let's get started. This is horrible stuff.

The first line is a production in ASN.1 terminology; we use the term assignment throughout this page in preference. An assignment may be completed on a single line or be enclosed in { } (braces or curly brackets) if it extends over more than one line (call it a clause if that is more understandable, though the term is not used in ASN.1). The item (strictly a lexical item, X.680 5.2 - or token if you prefer) on the left of the assignment symbol is a Type and begins with a Capital letter and does not appear in the Reserved/KEYWORDS list (which includes the UniversalTypes). We refer to this throughout as a UserType ('cos the ASN.1 module writer (user) defined it). The item on the right is also a Type and in this case is SEQUENCE (one of the UniversalTypes) which defines an ordered sequence of elements. So our first assignment (type definition if you prefer) looks like this:


/* TBSCertificate is a new UserType we are defining (its name starts 
 with a Capital letter and is not in the Reserved/KEYWORD list). 
 It is defined on multiple lines (delimited by the braces {}) and consists of 
 a SEQUENCE (a UniversalType) defining an ordered
 list of elements
 (call it a SEQUENCE clause if that works for you)
 which we'll look at next
 (and this uses the unusual ASN.1 multi-line comment style)
 */ 
 
TBSCertificate  ::=  SEQUENCE  {  
    ....  -- placemarker indicates one or more elements are present
          -- but removed to keep it simple
    }

Now we'll add back some missing stuff from the SEQUENCE definition in this worked example:

TBSCertificate  ::=  SEQUENCE  {
      
    ....   -- indicates missing items to simplify  the example
      
    serialNumber         CertificateSerialNumber,
      
    ....   -- indicates missing items to simplify  the example
      
    validity             Validity,
      
    ....   -- indicates missing items to simplify  the example
    }

A SEQUENCE consists of an ordered list of elements of different types, in the example case we have 2 (serialNumber and validity). Each item within a SEQUENCE has a left hand identifier (an arbitrary, but usually meaningful, name not in the Reserved/KEYWORD list and starting with a lower case letter (X.680 11.3)). In the snippet above these are serialNumber and validity. The right hand names in both cases are UserTypes (they start with a Capital and are not in the Reserved/KEYWORD list).

There is a common convention whereby the right hand name is the same as the left hand name except that it starts with a Capital letter (because it's a UserType). This is only a convenience and as shown above is frequently broken (serialNumber). The scope of the identifier is local to the SEQUENCE (or its enclosing clause). All UserType names have global scope (within the ASN.1 module and using IMPORTS, a lot further).

Now, we know, if names are UserType they must have an assignment (a type definition) and these are shown below in the example ASN.1 source:


-- this is the standard X.509 Certificate assignment (production) with some
-- missing items to keep things nice and simple

  TBSCertificate  ::=  SEQUENCE  {
      
    ....   -- indicates missing items to simplify  the example
      
    serialNumber         CertificateSerialNumber,
      
    ....   -- indicates missing items to simplify  the example
      
    validity             Validity,
      
    ....   -- indicates missing items to simplify  the example
    }
			
CertificateSerialNumber  ::=  INTEGER

  Validity ::= SEQUENCE {
      notBefore      Time,
      notAfter       Time  }

  Time ::= CHOICE {
      utcTime        UTCTime,
      generalTime    GeneralizedTime }

The line starting with CertificateSerialNumber is the UserType assignment corresponding to the identifier serialNumber within the SEQUENCE assignment for TBSCertificate (TBSCertificate, if you are interested in these things, originally stood for ToBeSignedCertificate).

-- single line UserType assignment (or type definition) of INTEGER
-- INTEGER is one of the UniversalTypes

CertificateSerialNumber  ::=  INTEGER

-- CertificateSerialNumber has global scope so any reference to 
-- this TypeName will pickup an INTEGER assignment
-- if we did not require global scope we could have done this

TBSCertificate ::= SEQUENCE { -- start of SEQUENCE
  ... -- missing stuff
  serialNumber   INTEGER,
  ... -- more missing stuff
  }  -- termination of SEQUENCE

Validity is a bit more complex and its assignment uses another SEQUENCE (containing the two identifiers notBefore and notAfter, each of which has a UserType of Time).

Validity ::= SEQUENCE {

    -- this SEQUENCE consists of two elements each of which 
    -- uses the same UserType with a TypeName of Time
		
    notBefore      Time,
    notAfter       Time  }
		
    -- Note: the position of the terminating brace can be on the 
    -- same line or a new line, ASN.1 is (pretty much) free format

The assignment (type definition) for the UserType Time consists of the Reserved/KEYWORD CHOICE (another multi-line structure) which indicates that only one element from the CHOICE can appear in the DER encoding but it can be any of those in the CHOICE clause. The format of elements within the CHOICE clause (the term clause is not used in X.680) is the same as for SEQUENCE as shown below:

Time ::= CHOICE {

-- each of the two elements (notBefore, notAfter) in Validity can consist of 
-- only one of the UniversalTypes UTCTime or GeneralizedTime
-- Thus, notBefore could use UTCTime and notAfter could use GeneralizedTime 
-- (or the opposite) or both could use the same UniversalType

      utcTime        UTCTime,
      generalTime    GeneralizedTime }
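DER encoding is covered in the next section, but the practical effect of CHOICE is easy to sketch now: the decoder looks at the tag it received to find out which alternative was chosen. The Python below is our own illustration (decode_time is a made-up name); it assumes short-form lengths and times written without seconds, as in the values used on this page.

```python
from datetime import datetime

def decode_time(tlv):
    """Return (identifier, datetime) for one encoded Time CHOICE."""
    tag, length, value = tlv[0], tlv[1], tlv[2:]
    if length != len(value):
        raise ValueError("length octet does not match the value")
    text = value.decode("ascii")
    if tag == 0x17:                       # UTCTime: two digit year
        return "utcTime", datetime.strptime(text, "%y%m%d%H%MZ")
    if tag == 0x18:                       # GeneralizedTime: four digit year
        return "generalTime", datetime.strptime(text, "%Y%m%d%HZ")
    raise ValueError("tag is not one of the Time alternatives")

not_before = decode_time(bytes.fromhex("170b313730383130313030305a"))
not_after = decode_time(bytes.fromhex("180b323032373038313031305a"))
print(not_before)
print(not_after)
```

Nothing in the encoding says CHOICE. The receiver sees either a UTCTime or a GeneralizedTime where the ASN.1 source says a Time should be.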

We've covered a reasonable amount of ground. Grab your favorite sleep-preventing beverage and let's now look at encoding this stuff into DER. You can lie down in a darkened room to recover after completing this section.

DER encoding of X.509 Subset

To provide a simple DER encoded example let's assume that TBSCertificate only consists of the two items we have looked at. The ASN.1 source we are going to encode is therefore:

-- this is the standard assignment (production) with some
  -- removed items to keep things simple
	
  TBSCertificate  ::=  SEQUENCE  {
    -- we only use these two elements for simplicity
    serialNumber         CertificateSerialNumber,
    validity             Validity
  }
			
CertificateSerialNumber  ::=  INTEGER

  Validity ::= SEQUENCE {
      notBefore      Time,
      notAfter       Time  }

  Time ::= CHOICE {
      utcTime        UTCTime,
      generalTime    GeneralizedTime }

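The above ASN.1 source will be DER encoded using the real values CertificateSerialNumber = 1 (complex choice) and notBefore = 10th, August 2017 (using UTCTime) and notAfter = 10th, August 2027 (using GeneralizedTime). Both will use the UTC time variant. The times are written without seconds to keep the example short; strict DER (X.690 11.7 and 11.8) requires the seconds to be present, so a real certificate would carry them.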

DER encoding is a classic TLV (Type, Length, Value) style encoding method defined in X.690. You need to be thoroughly familiar with binary and bit numbering to understand this stuff. The binary DER encoded string for the above ASN.1 will consist of a series of contiguous octets. They are shown hierarchically below to make them easier to visualize. We use ASN.1 style commenting even though it's not applicable in this context:

Warning for RFC readers: The ITU and IETF (RFCs) use radically different bit-numbering schemes.

-- DER encoding of SEQUENCE corresponding to
-- TBSCertificate  ::=  SEQUENCE 

301f

-- type (tag) is 16 (10 hex) = SEQUENCE with the Constructed 
-- bit set (20 hex), 
-- length = 31 (1f hex) which includes all the elements
-- in the SEQUENCE (3 + 28),
-- value is all the items in the SEQUENCE 


  -- DER encoding of CertificateSerialNumber = 1
 
  020101

  -- type (tag) is 2 = INTEGER, length is 1, value is 01
 
  -- DER encoding of SEQUENCE corresponding to
  -- Validity ::= SEQUENCE

  301a

  -- Validity is itself a SEQUENCE so it gets its own type (tag) of
  -- 30 hex, length = 26 (1a hex) which covers notBefore and
  -- notAfter (13 + 13)

    -- DER encoding of notBefore 
 
    170b313730383130313030305a

    -- type (tag) is 17 hex (23 decimal) = UTCTime, length is 11 (0b hex),
    -- value is the ASCII encoding of 1708101000Z (10th August 2017, 10AM UTC)
 
 
    -- DER encoding of notAfter

    180b323032373038313031305a

    -- type (tag) is 18 hex (24 decimal) = GeneralizedTime, length is 11 (0b hex),
    -- value is the ASCII encoding of 2027081010Z (10th August, 2027, 10AM UTC)

-- complete DER octet string for this ASN.1
301f020101301a170b313730383130313030305a180b323032373038313031305a

And that's all there is to DER encoding of this highly artificial example.
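If you want to check the octets by machine, here is a minimal Python sketch that builds the encoding from the ASN.1 definition, with Validity encoded as the nested SEQUENCE it is defined to be. The helper names (tlv, der_integer) are ours. It assumes single-octet tags and non-negative INTEGERs, and it reuses the page's times without seconds.

```python
def tlv(tag, value):
    """One TLV with a single-octet tag and a short or long definite length."""
    n = len(value)
    if n < 0x80:
        length = bytes([n])                         # short form
    else:
        body = n.to_bytes((n.bit_length() + 7) // 8, "big")
        length = bytes([0x80 | len(body)]) + body   # long form
    return bytes([tag]) + length + value

def der_integer(n):
    """INTEGER for n >= 0; the extra octet keeps the sign bit clear."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative values")
    return tlv(0x02, n.to_bytes(n.bit_length() // 8 + 1, "big"))

serial = der_integer(1)
validity = tlv(0x30, tlv(0x17, b"1708101000Z") + tlv(0x18, b"2027081010Z"))
tbs = tlv(0x30, serial + validity)                  # the outer SEQUENCE
print(tbs.hex())
```

The inner SEQUENCE contributes the octets 30 1a, and the outer length becomes 1f (3 for the INTEGER plus 28 for Validity).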

PKCS ContentInfo - ASN.1 1988 Version

All PKCS containers (and CMS - RFC 5652) use the ContentInfo entity to identify the parts within the container. The ASN.1 is from RFC 5652 Section 3.

Note: This definition of ContentInfo uses the now obsolete ASN.1 1988 version. It is covered here because it introduces a number of concepts in a simpler fashion (and represents a gentler start) than the ASN.1 (2002) versions which supersede it and because you will still come across (frequently) the older ASN.1 notations. The resultant DER encoding for the 1988 and 2002 ASN.1 source will be identical. ASN.1 2002 ContentInfo worked example. (Warning: It uses a lot of terminology developed in this and the SubjectAltName worked examples.)

ContentInfo ::= SEQUENCE {
     contentType ContentType,
     content
       [0] EXPLICIT ANY DEFINED BY contentType OPTIONAL }

   ContentType ::= OBJECT IDENTIFIER

A deceptively short ASN.1 definition. Must be easy. Wrong.

ContentInfo is a UserType assignment and uses a SEQUENCE with two identifiers (contentType and content). contentType is nice and simple so let's get it out of the way:

-- ContentInfo UserType assignment
ContentInfo ::= SEQUENCE {

     -- the identifier contentType references the UserType ContentType

     contentType ContentType,
     .... -- omitted content for simplicity
     }
				
  -- ContentType (UserType) assignment references the 
  -- Universal type OBJECT IDENTIFIER (or OID)
  -- the definition allows any OID (it has no assigned value)
	
   ContentType ::= OBJECT IDENTIFIER

If you are not familiar with OBJECT IDENTIFIER (a.k.a. OID) take the time to get a handle on them by following the link or reading this LDAP related OID page. They are important. We'll meet another OID example later which adds some quirks. Something to look forward to. This assignment simply says that ContentType can contain any valid OBJECT IDENTIFIER (OID) - if we are being picky it simply says any OBJECT IDENTIFIER and does not mandate a valid one.

Note: For LDAP users, ASN.1 OIDs can be confusing because ASN.1 uses a space separator between elements (subidentifiers in X.680 jargon) in the definitions of the OID whereas LDAP uses a dot. When DER encoding an OBJECT IDENTIFIER the separator is not present. It's the DER decoder that assembles the OID so it can use its locally preferred separator.
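To see why there is no separator in the encoded form, here is a rough Python sketch of OBJECT IDENTIFIER encoding and decoding. The function names are ours and it assumes the encoded value is shorter than 128 octets (short-form length). The first two subidentifiers are packed into one number (40 times the first plus the second) and every number is then written base 128, with the top bit set on all but its last octet. The second sample, 1.2.840.113549.1.7.1, is the id-data content type from RFC 5652.

```python
def encode_oid(dotted):
    """Encode a dotted OID string as an OBJECT IDENTIFIER TLV."""
    arcs = [int(a) for a in dotted.split(".")]
    subids = [40 * arcs[0] + arcs[1]] + arcs[2:]    # first two arcs share one number
    body = bytearray()
    for sub in subids:
        chunk = [sub & 0x7F]                        # last octet: top bit clear
        sub >>= 7
        while sub:
            chunk.append(0x80 | (sub & 0x7F))       # earlier octets: top bit set
            sub >>= 7
        body.extend(reversed(chunk))
    return bytes([0x06, len(body)]) + bytes(body)

def decode_oid(der):
    """Rebuild the OID using a dot as the locally preferred separator."""
    subids, value = [], 0
    for octet in der[2:2 + der[1]]:
        value = (value << 7) | (octet & 0x7F)
        if not octet & 0x80:                        # top bit clear ends a number
            subids.append(value)
            value = 0
    first = min(subids[0] // 40, 2)
    arcs = [first, subids[0] - 40 * first] + subids[1:]
    return ".".join(str(a) for a in arcs)

print(encode_oid("2.5.29.17").hex())
print(decode_oid(bytes.fromhex("06092a864886f70d010701")))
```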

The second identifier (content) introduces a bunch of new concepts:

ContentInfo ::= SEQUENCE {
     ....  -- omitted content for simplicity
     
     -- the following constitutes a single element on two lines 
     -- ASN.1 is free format
     content
       [0] EXPLICIT ANY DEFINED BY contentType OPTIONAL }

Let's get the simple stuff out of the way again. OPTIONAL is a Reserved/KEYWORD that does what it says on the tin. It indicates the item may, or may not, be present in the encoded form. If the DER encoder is not supplied with content it does not encode anything. The receiver, however, has to figure out it's missing.

TaggedTypes

All the assignments we have seen so far use UniversalTypes (assuming sequential reading of the page - and no impatient jumping around!). However, UniversalTypes have potential limitations simply because they are generic. An INTEGER is an INTEGER, right? If anything apart from the use of a generic UniversalType is required then it can be indicated by using the TaggedType notation (X.680 30). This notation may take the form [x] to indicate a Context Specific class with tag value x, [APPLICATION x] to indicate an application class with tag value x and [PRIVATE x] to indicate a private class with a tag value of x. (Where x is a numeric tag value; values in the range 0 to 30 fit in the single-octet form used throughout this page.) All the TaggedTypes used in the RFC documentation are Context Specific exclusively. Context Specific simply means we are going to give the assigned Type an additional meaning over-and-above its raw Type meaning within (in the context of) the specific structure in which it appears. The TaggedType classes and their numeric value (tag number) are DER encoded into the bit string when we encounter them in the ASN.1 source.

Fascinating. But what do TaggedTypes do and when/why do we use these [x] formats?

Let's take a trivial assignment of a UserType (type definition):

StupidType ::= SEQUENCE {
      one        One,
      two        Two 
      }
One ::= INTEGER
Two ::= INTEGER

-- in the DER encoded form the receiver will always
-- get two INTEGERs One followed by Two and can differentiate 
-- between them based on order

Now, let's change the definition to make both items OPTIONAL:


StupidType ::= SEQUENCE {
      one        One OPTIONAL,
      two        Two OPTIONAL
      }
One ::= INTEGER
Two ::= INTEGER

-- when this structure is DER encoded the receiver can get
-- two INTEGERs One followed by Two and can differentiate based on order 
-- zero INTEGERs in which case One and Two are obviously missing
-- one INTEGER in which case the receiver has no idea which one it 
-- is (One or Two?)

-- the problem is fixed by using a Context Specific syntax ([x]) as shown

StupidType ::= SEQUENCE {
      one      [0]  One OPTIONAL,
      two      [1]  Two OPTIONAL
      }
One ::= INTEGER
Two ::= INTEGER

-- when this structure is encoded One will DER encode a 
-- Context Specific type with a tag of 0 preceding the INTEGER and
-- Two will DER encode a Context Specific type with a tag of 1 
-- preceding the INTEGER
-- the receiver can unambiguously differentiate between any combination
-- of results

-- the encoding will be different depending on whether EXPLICIT or IMPLICIT
-- tagging is defined or defaulted (ASN.1 module defined default)

-- Reserved/KEYWORD IMPLICIT or EXPLICIT are optional depending on
-- the module default (follow links to read more)

StupidType ::= SEQUENCE {
      one      [0] EXPLICIT | IMPLICIT One OPTIONAL,
      two      [1] EXPLICIT | IMPLICIT Two OPTIONAL
    }
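
To see the tags doing their job, here is a rough Python sketch of a decoder for the tagged StupidType. It assumes IMPLICIT tagging (so [0] and [1] arrive as 80 and 81 hex in place of the INTEGER tag) and short-form lengths; decode_stupid_type and the sample values 5 and 7 are made up for the illustration.

```python
def decode_stupid_type(der: bytes) -> dict:
    """Decode StupidType, assuming IMPLICIT tags and short-form lengths."""
    if der[0] != 0x30 or der[1] != len(der) - 2:
        raise ValueError("expected a SEQUENCE with a short-form length")
    names = {0x80: "one", 0x81: "two"}  # [0] and [1], Context Specific, primitive
    result = {"one": None, "two": None}  # None marks a missing OPTIONAL item
    pos = 2
    while pos < len(der):
        tag, length = der[pos], der[pos + 1]
        value = der[pos + 2 : pos + 2 + length]
        # an unknown tag raises KeyError, which is fine for a sketch
        result[names[tag]] = int.from_bytes(value, "big", signed=True)
        pos += 2 + length
    return result


print(decode_stupid_type(bytes.fromhex("3003810107")))        # only two is present
print(decode_stupid_type(bytes.fromhex("3006800105810107")))  # both are present
print(decode_stupid_type(bytes.fromhex("3000")))              # neither is present
```

With a single item present the tag (81 hex) tells the receiver it is Two, which is exactly the ambiguity the untagged version could not resolve.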

Now, let's consider a further example where we want to give some meaning to a UniversalType:

StupidType ::= SEQUENCE {
      version  [0]  Version,
      two           Two
      }
			
Version ::= INTEGER
Two     ::= INTEGER

-- in this case when the DER decoder receives a Context Specific tag of 0
-- it knows the INTEGER value is a Version and can handle it appropriately

Back to the original item:

content
       [0] EXPLICIT ANY DEFINED BY contentType

Almost finally(!), we have the word EXPLICIT which is a Reserved/KEYWORD and indicates that a TaggedType of Context Specific class with a tag value of 0 (and with the Constructed bit set) is added as a separate item during encoding before the item carrying the data. In the absence of any DEFINITIONS EXPLICIT TAGS | IMPLICIT TAGS in the ASN.1 module, EXPLICIT is the default and could have been omitted.

Note: The DER encodings for EXPLICIT and IMPLICIT are different. Many RFCs include separate ASN.1 source modules for IMPLICIT and EXPLICIT defaults. DER Decoders, ideally, should be prepared to handle either version.
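
The difference is easiest to see side by side. The Python sketch below encodes the INTEGER 5 under a [0] tag both ways; the helper tlv is our own name and only copes with values shorter than 128 octets.

```python
def tlv(tag: int, value: bytes) -> bytes:
    """Tag, length, value with a short-form length (value under 128 octets)."""
    if len(value) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([tag, len(value)]) + value


integer_five = tlv(0x02, b"\x05")   # plain INTEGER 5
explicit = tlv(0xA0, integer_five)  # [0] EXPLICIT wraps the complete INTEGER item
implicit = tlv(0x80, b"\x05")       # [0] IMPLICIT replaces the INTEGER tag

print(integer_five.hex())  # 020105
print(explicit.hex())      # a003020105
print(implicit.hex())      # 800105
```

EXPLICIT costs two extra octets but keeps the underlying type visible; IMPLICIT is shorter but the receiver must know from the ASN.1 source that 80 hex hides an INTEGER.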

The final part is ANY DEFINED BY contentType. ANY DEFINED BY is an obsolete ASN.1 structure type that meant, well, anything goes defined by the OID of the item (contentType). This text is essentially syntax sugar. Useful to the human reader but invisible in the encoded DER. Maybe the ASN.1 parser or DER encoder finds this stuff useful. The net result of this definition is to tell the human reader that the content (value part) is determined by the OBJECT IDENTIFIER of contentType.

DER encoding of ContentInfo

Our ASN.1 source of ContentInfo is:

ContentInfo ::= SEQUENCE {
     contentType ContentType,
     content
       [0] EXPLICIT ANY DEFINED BY contentType OPTIONAL }

   ContentType ::= OBJECT IDENTIFIER

To give a specific meaning to the encoding we will use the values ContentType = 2 3 4 5 (or 2.3.4.5 in LDAP format), an entirely fictitious OBJECT IDENTIFIER, and content ANY DEFINED BY - (recall that sugar stuff) - ContentType will be the IA5String 'wow'. A somewhat frivolous scenario, perhaps. ContentInfo used in a PKCS structure would typically indicate that content would be an X.509 certificate or some similarly gruesomely complex structure which even your mother would have difficulty encoding.

-- DER encoding of SEQUENCE corresponding to
-- ContentInfo ::= SEQUENCE

300c

-- The Type/Tag is 16 (10 hex) with the 
-- Constructed bit set (20 + 10 = 30 hex),
-- length is 12 (0c hex) (5 + 2 + 5)(covers the whole sequence),
-- value is all the items in the SEQUENCE

  -- DER encoding of ContentType 

  0603530405

  -- The Type/Tag is 6 (06 hex) = OBJECT IDENTIFIER
  -- length is 3,
  -- value is the encoding of 2 3 4 5 (or 2.3.4.5 in LDAP format)
  -- (more detail of how this is DER encoded)

  -- DER encoding of content using IA5String

  -- but preceded by a Context Specific item with tag zero ([0] EXPLICIT)

  a005

  -- The Type/Tag is a0 (hex) consisting of
  -- Context Specific class (80 hex), constructed (20 hex)
  -- TaggedType = 0  
  -- length is 5 (covers the whole IA5String item that follows),
  -- value is the complete encoding of the tagged item

-- DER encoding of content

  1603776f77

  -- The Type/Tag is 22 (16 hex) = IA5String
  -- length is 3,
  -- value is hex (ascii) of 'wow'

-- complete DER encoded bit string for this ASN.1 source
300c0603530405a0051603776f77

To echo the IA5String. wow.
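
If you want to check a hand encoding like the one above, a small Python sketch that walks the Type/Length/Value items is enough. The function walk is our own and assumes single-octet tags and short-form lengths, which holds for this example but not for real certificates.

```python
def walk(der: bytes, depth: int = 0) -> list:
    """Return (depth, tag, length, value-or-None) for every encoded item.

    Sketch only: single-octet tags and short-form lengths.
    """
    items = []
    pos = 0
    while pos < len(der):
        tag, length = der[pos], der[pos + 1]
        if length & 0x80:
            raise ValueError("long-form length not handled in this sketch")
        value = der[pos + 2 : pos + 2 + length]
        if tag & 0x20:  # Constructed bit: the value is more encoded items
            items.append((depth, tag, length, None))
            items.extend(walk(value, depth + 1))
        else:
            items.append((depth, tag, length, value))
        pos += 2 + length
    return items


content_info = bytes.fromhex("300c0603530405a0051603776f77")
for depth, tag, length, value in walk(content_info):
    shown = "" if value is None else value.hex()
    print("  " * depth + f"{tag:02x} {length:02x} {shown}")
```

The indentation of the printed lines follows the nesting: the SEQUENCE at the top, the OID and the [0] wrapper inside it, and the IA5String inside the wrapper.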

X.509 V3 Extension SubjectAltName

The following ASN.1 source is taken from RFC 5912 Section 14 (PKIX1Implicit-2009) and defines the X.509 (SSL) certificate Version 3 (V3) Extension called SubjectAltName which allows an extended set of names covered by an X.509 certificate to be defined rather than the single name of the Subject attribute. The extension name is sometimes abbreviated to SAN.

-- subject alternative name extension OID and syntax

   ext-SubjectAltName EXTENSION ::= { SYNTAX
       GeneralNames IDENTIFIED BY id-ce-subjectAltName }
   id-ce-subjectAltName OBJECT IDENTIFIER ::=  { id-ce 17 }

   GeneralNames ::= SEQUENCE SIZE (1..MAX) OF GeneralName

   GeneralName ::= CHOICE {
        otherName                   [0]  INSTANCE OF OTHER-NAME,
        rfc822Name                  [1]  IA5String,
        dNSName                     [2]  IA5String,
        x400Address                 [3]  ORAddress,
        directoryName               [4]  Name,
        ediPartyName                [5]  EDIPartyName,
        uniformResourceIdentifier   [6]  IA5String,
        iPAddress                   [7]  OCTET STRING,
        registeredID                [8]  OBJECT IDENTIFIER
   }

   
   OTHER-NAME ::= TYPE-IDENTIFIER

   EDIPartyName ::= SEQUENCE {
       nameAssigner    [0] DirectoryString {ubMax} OPTIONAL,
       partyName       [1] DirectoryString {ubMax}
   }

This is quite a complicated ASN.1 fragment so we'll take it one piece at a time, easy stuff first:


-- this is the assignment of the UserType GeneralNames used in a lot of places

GeneralNames ::= SEQUENCE SIZE (1..MAX) OF GeneralName

-- first time we have seen a SEQUENCE assignment on a single line

-- defines a constraint using SIZE(min..max) 
--   min = 1 (must be at least 1) 
--   max = MAX (reserved/keyword)

-- Note the use of SEQUENCE ....OF when SIZE is present
-- OF indicates the items must all be of the same type

-- So, The UserType GeneralNames is a SEQUENCE containing 1 
-- or more instances of UserType GeneralName

MAX, while a Reserved/KEYWORD, has no value defined in any X.680 standard. By default this means it's left to the implementor to fix an upper limit. The poor old receiver must be able to handle any number of GeneralName elements. The minimum value within SIZE can use the reserved word MIN which has the same characteristics as MAX - none. So much for standards.

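Note: SIZE(min..max) can also be used with UniversalTypes, for example PrintableString or OCTET STRING, where it constrains (or defines) the number of characters or octets. An INTEGER is limited with a value range instead, for example INTEGER (0..255).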

Now the GeneralName assignment which has a couple of minor and one major quirk:

GeneralNames ::= SEQUENCE SIZE (1..MAX) OF GeneralName

-- CHOICE indicates only one of the types below can be selected for 
-- each instance of GeneralName

GeneralName ::= CHOICE {
 otherName                 [0]  INSTANCE OF OTHER-NAME, -- QuirkyType!
 rfc822Name                [1]  IA5String,        -- UniversalType
 dNSName                   [2]  IA5String,        -- UniversalType
 x400Address               [3]  ORAddress,        -- UserType
 directoryName             [4]  Name,             -- UserType
 ediPartyName              [5]  EDIPartyName,     -- UserType
 uniformResourceIdentifier [6]  IA5String,        -- UniversalType
 iPAddress                 [7]  OCTET STRING,     -- UniversalType
 registeredID              [8]  OBJECT IDENTIFIER -- UniversalType
 }
 
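-- all the elements use TaggedType IMPLICIT 
-- because the ASN.1 module from which this was taken has
-- DEFINITIONS IMPLICIT TAGS
-- (one exception: directoryName is encoded as EXPLICIT because
-- Name is a CHOICE and a tag on a CHOICE is always EXPLICIT)
-- the UserTypes ORAddress, Name will have an assignment 
-- you just have to find them!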

-- EDIPartyName UserType assignment is below

EDIPartyName ::= SEQUENCE {
       nameAssigner    [0] DirectoryString {ubMax} OPTIONAL,
       partyName       [1] DirectoryString {ubMax}
       
   }

The assignment of EDIPartyName above uses Parameterization (defined in X.683) and a little bit of Constraint (defined in X.682). For the sake of completeness (you're excited, we can tell) let's follow this assignment to its gory conclusion.


EDIPartyName ::= SEQUENCE {
       nameAssigner    [0] DirectoryString {ubMax} OPTIONAL,
       partyName       [1] DirectoryString {ubMax}
     }
		 
-- {ubMax} is the actual parameter passed
-- to the DirectoryString (its parent) assignment
-- using the Parameterization feature defined in X.683

-- ubMax is assigned in the module as
-- (Note that ubMax is a valueReference)

ubMax INTEGER ::= 32768

-- DirectoryString is a UserType and its assignment is

DirectoryString{INTEGER:maxSize} ::= CHOICE {
			
      -- {INTEGER:maxSize} indicates that a parameter of UniversalType INTEGER
      -- is required for each invocation of DirectoryString 
      -- see EDIPartyName and ubMax above
      -- maxSize is a local variable that takes the value
      -- of the supplied parameter
      -- the net effect is that ubMax (value = 32768 (or 32K))
      -- replaces maxSize on each of the elements below
      -- thus SIZE(1..maxSize) is, in this instance, SIZE(1..32768)
      -- by simply changing the value of ubMax we can change the limit
			
      teletexString    TeletexString(SIZE (1..maxSize)),
      printableString  PrintableString(SIZE (1..maxSize)),
      bmpString        BMPString(SIZE (1..maxSize)),
      universalString  UniversalString(SIZE (1..maxSize)),
      uTF8String       UTF8String(SIZE (1..maxSize))
			
  }
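
If Parameterization feels alien, it may help to think of it as a function that takes maxSize and hands back a constrained type. The Python sketch below is only an analogy under that assumption; directory_string and UB_MAX are our own names and nothing here is generated from the ASN.1 source.

```python
UB_MAX = 32768  # mirrors: ubMax INTEGER ::= 32768


def directory_string(max_size: int):
    """Return a checker standing in for DirectoryString{max_size}."""
    def check(value: str) -> bool:
        # SIZE (1..maxSize) on a character string counts characters
        return 1 <= len(value) <= max_size
    return check


party_name_ok = directory_string(UB_MAX)  # like DirectoryString {ubMax}
tiny_ok = directory_string(4)             # same definition, different parameter

print(party_name_ok("ACME"))  # True
print(party_name_ok(""))      # False, SIZE starts at 1
print(tiny_ok("toolong"))     # False, longer than 4 characters
```

One definition, many limits: that is all the {ubMax} notation is buying the RFC authors.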

Well, that was pretty simple. We forgot what we were originally talking about before that nail-biting diversion:


GeneralNames ::= SEQUENCE SIZE (1..MAX) OF GeneralName

-- CHOICE indicates only one of the types below can be selected for 
-- each instance

GeneralName ::= CHOICE {
  otherName                 [0]  INSTANCE OF OTHER-NAME, -- QuirkyType!
  rfc822Name                [1]  IA5String,        -- UniversalType
  dNSName                   [2]  IA5String,        -- UniversalType
  x400Address               [3]  ORAddress,        -- UserType
  directoryName             [4]  Name,             -- UserType
  ediPartyName              [5]  EDIPartyName,     -- UserType
  uniformResourceIdentifier [6]  IA5String,        -- UniversalType
  iPAddress                 [7]  OCTET STRING,     -- UniversalType
  registeredID              [8]  OBJECT IDENTIFIER -- UniversalType
 }

The result of these assignments is that GeneralNames will consist of 1 or more GeneralName UserTypes each of which may be encoded in only one of the ways defined by the CHOICE.

Note: The use of SEQUENCE ...OF in the above definition indicates that, strictly, the SEQUENCE must consist of items of the same type but GeneralName offers a CHOICE of different types. We broke the rules? No. The SEQUENCE consists only of the GeneralName Type, the assignment of GeneralName (which uses the CHOICE) is not visible at the GeneralNames assignment. Too subtle for us. Pass the pain medication.
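
From the receiver's side the subtlety largely disappears: every item in the SEQUENCE is a GeneralName and its Context Specific tag number says which alternative of the CHOICE was used. A Python sketch, assuming short-form lengths and the IMPLICIT tagging of this module; list_general_names and the sample values are hypothetical.

```python
GENERAL_NAME = {
    0: "otherName", 1: "rfc822Name", 2: "dNSName", 3: "x400Address",
    4: "directoryName", 5: "ediPartyName", 6: "uniformResourceIdentifier",
    7: "iPAddress", 8: "registeredID",
}


def list_general_names(content: bytes) -> list:
    """Name each GeneralName found in the value part of GeneralNames."""
    found = []
    pos = 0
    while pos < len(content):
        tag, length = content[pos], content[pos + 1]
        if tag & 0xC0 != 0x80:
            raise ValueError("every alternative carries a Context Specific tag")
        value = content[pos + 2 : pos + 2 + length]
        found.append((GENERAL_NAME[tag & 0x1F], value))  # low 5 bits = tag number
        pos += 2 + length
    return found


# two dNSName items (a.com, b.com) and one iPAddress (192.0.2.1)
sample = bytes.fromhex("8205612e636f6d" "8205622e636f6d" "8704c0000201")
for name, value in list_general_names(sample):
    print(name, value)
```

The loop never needs to know in advance which alternatives will appear, only that each item is one GeneralName.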

Now, let's look at the use of the UniversalType INSTANCE OF in this fragment:

GeneralName ::= CHOICE {
  otherName      [0]  INSTANCE OF OTHER-NAME,
  -- elements omitted to keep simple
  }
	 
 OTHER-NAME ::= TYPE-IDENTIFIER
	 
-- INSTANCE OF is only used with
-- ClassTypes that use TYPE-IDENTIFIER (a builtin or canned ClassType)
-- and expands to the following SEQUENCE to capture the standard 
-- information (id and Type) used by TYPE-IDENTIFIER 

SEQUENCE {
-- & is an artifact of CLASS construction
-- OTHER-NAME.&id indicates that id is specific to the
-- OTHER-NAME invocation of TYPE-IDENTIFIER
  type-id OTHER-NAME.&id,
  value [0] OTHER-NAME.&Type
}

-- so, if the receiver gets a Context Specific class with a tag of 0 
-- (with the Constructed bit set) for GeneralName 
-- this will be followed by the encoded SEQUENCE 
-- (see above) containing an OBJECT IDENTIFIER (id) followed 
-- by another Context Specific class (with tag 0) (and the constructed 
-- bit set) defining the content (Type) implicit in the 
-- OBJECT IDENTIFIER definition
-- NOTE: the above description assumes EXPLICIT (the default);
-- with IMPLICIT (as in this module) the outer [0] replaces the
-- SEQUENCE tag instead of preceding it

So all that remains from the original snippet for SubjectAltName is this pile of junk (advanced technical jargon):

ext-SubjectAltName EXTENSION ::= { SYNTAX
       GeneralNames IDENTIFIED BY id-ce-subjectAltName }
   id-ce-subjectAltName OBJECT IDENTIFIER ::=  { id-ce 17 }

Easy Bits (maybe) first:


 id-ce-subjectAltName OBJECT IDENTIFIER ::=  { id-ce 17 }

-- this is a valueReference assignment
-- it assigns a value to a named Type, in this case 
-- an OBJECT IDENTIFIER (OID) 
-- the value part is the OID definition { id-ce 17 } of which
-- 17 is a specific value (a subidentifier X.690 8.19.2) in the OID  
-- and id-ce is another valueReference with this assignment

id-ce OBJECT IDENTIFIER  ::=  { joint-iso-ccitt(2) ds(5) 29 }

-- so by combining the two valueReferences we figure that
-- id-ce-subjectAltName is an OBJECT IDENTIFIER type with the 
-- assigned value of 2 5 29 17 (or 2.5.29.17 in LDAP format)

-- Note: When working with OIDs two formats are available
-- a namedNumber format, for example, { joint-iso-ccitt(2) ds(5) 29 } 
-- where values are in parentheses following the name (the name is essentially 
-- syntax sugar) and which can be a useful indicator of OID delegation, or 
-- it could have been written using a simple number format {2 5 29}
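
For the curious, here is a Python sketch of how those numbers end up as DER octets: the first two values are folded into one subidentifier (first * 40 + second) and every subidentifier is written base 128 with the top bit set on all but its last octet. The function encode_oid is our own and only handles short-form lengths.

```python
def encode_oid(arcs: list) -> bytes:
    """DER-encode an OBJECT IDENTIFIER (tag 06) from its numbers."""
    first, second, *rest = arcs
    body = bytearray()
    for sub in [first * 40 + second, *rest]:  # first two numbers share a subidentifier
        chunk = [sub & 0x7F]                  # last octet: top bit clear
        sub >>= 7
        while sub:
            chunk.append((sub & 0x7F) | 0x80)  # earlier octets: top bit set
            sub >>= 7
        body.extend(reversed(chunk))
    if len(body) > 127:
        raise ValueError("long-form length not handled in this sketch")
    return bytes([0x06, len(body)]) + bytes(body)


id_ce = [2, 5, 29]
print(encode_oid(id_ce + [17]).hex())  # 0603551d11, id-ce-subjectAltName
print(encode_oid([2, 3, 4, 5]).hex())  # 0603530405, the fictitious OID used earlier
```

Combining the two valueReferences is just list concatenation here, which is all the { id-ce 17 } notation really means.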

We are left with this assignment, which is another valueReference:

ext-SubjectAltName EXTENSION ::= { SYNTAX
       GeneralNames IDENTIFIED BY id-ce-subjectAltName }

-- ext-SubjectAltName is the valueReference assigning a value

{ SYNTAX
       GeneralNames IDENTIFIED BY id-ce-subjectAltName }

-- to the ClassType EXTENSION  (TypeName is CAPITALS
-- and it's not in the Reserved/KEYWORDS list)

-- the value element can only be understood in the context of 
-- the EXTENSION ClassType assignment, so we'll look at that next

The ClassType assignment for EXTENSION is picked up from a common module PKIX-CommonTypes-2009 using IMPORTS in the ASN.1 source module and results in the ClassType assignment shown (from RFC 5912 Section 2):

Note: The sections below are a quick fly-over of what we call ClassTypes and what is formally termed an Information Object Class. If you need all the gory details then you need to read X.681. And good luck to you.

  -- EXTENSION ClassType assignment using CLASS Keyword
	
  EXTENSION ::= CLASS {
      
      -- & notation is an artifact of Class construction (X.681 7.4 and 7.5)
      -- id is the OBJECT IDENTIFIER of the extension (an identifier
      -- a.k.a. a silently assigned valueReference if your sense of 
      -- humor (or humour) stretches that far)
      -- ExtnType is a UserType defining the extension structure
      -- Critical is a UniversalType indicating the, optional, criticality 
      -- of the extension
			
      &id  OBJECT IDENTIFIER UNIQUE,
      &ExtnType,
      -- NOTE: because no Type is defined above this is generically referred 
      -- to as an Open Type (more jargon)
			
      &Critical    BOOLEAN DEFAULT {TRUE | FALSE }
      -- confusing definition 
      -- DEFAULT suggests a default value but no single value is defined:
      -- {TRUE | FALSE} defines the permissible values, not the default value
  } 

-- the WITH SYNTAX clause allows the designer of the ClassType 
-- to define a user-friendly(!) format (within braces) that will be 
-- used when this ClassType is used in an assignment. Literals
-- help the human reader and play no other role
 
	WITH SYNTAX {
	
      -- SYNTAX, IDENTIFIED BY and CRITICALITY are LITERALS
      -- which the class designer chose as hints for the user
      -- they must be CAPITALS and can be Reserved/KEYWORDS 
      -- including UniversalTypes just to confuse we mortals
      -- (there is a modest exclusion list in X.681 10.6)
      -- Literals are useful to the human but irrelevant to DER encoding
			
      SYNTAX &ExtnType IDENTIFIED BY &id
			
      -- the square brackets indicate this is optional 
			
      [CRITICALITY &Critical]
  }
	
  -- an assignment using ClassType EXTENSION could look something like
	
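  stupidExtension EXTENSION ::= {
      SYNTAX Ext-Structure IDENTIFIED BY ext-oid
      CRITICALITY {TRUE} }

  -- the Literals are those defined in WITH SYNTAX above
  -- note the use of upper and lower case is significant
  -- Ext-Structure will be a UserType (initial upper case) and 
  -- stupidExtension and ext-oid are valueReferences (initial lower case),
  -- the latter defining the OID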

While this defines EXTENSION it's vital to look at a couple of other uses of the EXTENSION ClassType to understand what is going on. Here they are:


-- some comments removed and reformatted to remove a line break
-- the EXTENSION. notation indicates the referenced field
-- (for example &id) relates to this ClassType

  Extension{EXTENSION:ExtensionSet} ::= SEQUENCE {
      extnID      EXTENSION.&id({ExtensionSet}),
      critical    BOOLEAN  DEFAULT FALSE,
      -- this is where the DEFAULT value is assigned
      extnValue   OCTET STRING (CONTAINING
                  EXTENSION.&ExtnType({ExtensionSet}{@extnID}))
                  --  contains the DER encoding of the ASN.1 value
                  --  corresponding to the extension type identified
                  --  by extnID
  }
  -- VITAL: the contents of all V3 extensions (this assignment is used by all V3
  -- Extensions) are encapsulated in an OCTET STRING
  -- why? excellent question
  -- EXTENSION.&id({ExtensionSet}) limits permissible values (somewhat)
  -- EXTENSION.&ExtnType({ExtensionSet}{@extnID}) limits the type to the
  -- one belonging to the extension whose OID appears in extnID
  -- {@extnID} is a component reference to the OID
	
	Extensions{EXTENSION:ExtensionSet} ::=
      SEQUENCE SIZE (1..MAX) OF Extension{{ExtensionSet}}

-- Extensions is a UserType assignment and uses parameterization
-- {EXTENSION:ExtensionSet} indicates a parameter of ClassType 
-- EXTENSION must be supplied when Extensions is referenced 
-- and limits the set of permitted extensions
-- ExtensionSet is a local variable that is substituted by the supplied parameter
-- Extensions is used in multiple places - this one is from the TBSCertificate 
-- assignment

TBSCertificate ::= SEQUENCE {
   .... -- omitted elements
   extensions      [3]  Extensions{{CertExtensions}} OPTIONAL
	 
	 -- CertExtensions is a UserType supplied as a parameter (X.683)
   -- to Extensions
   .... -- omitted elements
   }
  
-- CertExtensions UserType assignment using ClassType EXTENSION
-- as required in the parameter for Extensions

CertExtensions EXTENSION ::= {
           ext-AuthorityKeyIdentifier | ext-SubjectKeyIdentifier |
           ext-KeyUsage | ext-PrivateKeyUsagePeriod |
           ext-CertificatePolicies | ext-PolicyMappings |
           ext-SubjectAltName | ext-IssuerAltName |
           ext-SubjectDirectoryAttributes |
           ext-BasicConstraints | ext-NameConstraints |
           ext-PolicyConstraints | ext-ExtKeyUsage |
           ext-CRLDistributionPoints | ext-InhibitAnyPolicy |
           ext-FreshestCRL | ext-AuthorityInfoAccess |
           ext-SubjectInfoAccessSyntax, ... }

-- an extensible list including all extensions defined in 
-- this ASN.1 module
-- extensibility indicated by ... (ellipsis)
-- Note: these are all valueReferences
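
Put together, a decoder reads the OID, notes whether the optional BOOLEAN is there, unwraps the OCTET STRING and uses the OID to decide how to interpret what is inside. The Python sketch below follows that order; KNOWN_EXTENSIONS is a two-entry stand-in for CertExtensions, all function names are our own, and only short-form lengths are handled. The first sample is a typical critical basicConstraints extension, the second a hypothetical SubjectAltName holding one dNSName (a.com).

```python
KNOWN_EXTENSIONS = {          # OID value octets -> name in CertExtensions
    "551d11": "ext-SubjectAltName",    # 2 5 29 17
    "551d13": "ext-BasicConstraints",  # 2 5 29 19
}


def read_tlv(data: bytes, pos: int):
    """Return (tag, value, next_pos); short-form lengths only."""
    tag, length = data[pos], data[pos + 1]
    if length & 0x80:
        raise ValueError("long-form length not handled in this sketch")
    start = pos + 2
    return tag, data[start : start + length], start + length


def parse_extension(der: bytes) -> dict:
    tag, body, _ = read_tlv(der, 0)
    if tag != 0x30:
        raise ValueError("Extension must be a SEQUENCE")
    tag, extn_id, pos = read_tlv(body, 0)
    if tag != 0x06:
        raise ValueError("extnID must be an OBJECT IDENTIFIER")
    tag, value, pos = read_tlv(body, pos)
    critical = False                    # DEFAULT FALSE applies when absent
    if tag == 0x01:                     # BOOLEAN present
        critical = value != b"\x00"
        tag, value, pos = read_tlv(body, pos)
    if tag != 0x04:
        raise ValueError("extnValue must be an OCTET STRING")
    name = KNOWN_EXTENSIONS.get(extn_id.hex(), "unknown")
    return {"extension": name, "critical": critical, "inner": value.hex()}


print(parse_extension(bytes.fromhex("300f0603551d130101ff040530030101ff")))
print(parse_extension(bytes.fromhex("30100603551d11040930078205612e636f6d")))
```

The inner value is returned still encoded because its structure depends entirely on which extension the OID selected.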

So we wrap this up by looking at the user-friendly (defined by WITH SYNTAX) part again and we see:

ext-SubjectAltName EXTENSION ::= { SYNTAX
       GeneralNames IDENTIFIED BY id-ce-subjectAltName }

-- we know SYNTAX and IDENTIFIED BY are Literals
-- so the valueReference assignment using the ClassType EXTENSION
-- consists of GeneralNames ( a SEQUENCE of 1 or more GeneralName
-- each of which can use a different format (via the CHOICE))
-- and the EXTENSION is identified by the OID 
-- id-ce-subjectAltName (2 5 29 17)

-- the optional Literal CRITICALITY and its BOOLEAN value are omitted from this 
-- assignment because SubjectAltName is not (most of the time) a critical 
-- extension

We're done? Not so fast.

We postponed this gem until the end, partly out of a sense of drama and partly because it needs a lot of stuff (knowledge) before it makes any sense. Truth be told, it actually makes little sense even when you know the stuff. But it is as clear as mud on the third reading.

The GeneralName assignment includes the apparently innocuous line directoryName Name,. This is the dreaded DN (directoryName) with all that cn=blah, O=blah, blah stuff and it turns out to be a pretty complex little definition (most of which you now have the knowledge to interpret):


-- we go down a lot of rabbit holes before we figure this lot out

Name ::= CHOICE { -- only one possibility for now --
      rdnSequence  RDNSequence }

  RDNSequence ::= SEQUENCE OF RelativeDistinguishedName

  DistinguishedName ::=   RDNSequence

  RelativeDistinguishedName  ::=
      SET SIZE (1 .. MAX) OF SingleAttribute { {SupportedAttributes} }

  --  These are the known name elements for a DN

  SupportedAttributes ATTRIBUTE ::= {
      at-name | at-surname | at-givenName | at-initials |
      at-generationQualifier | at-x520CommonName |
      at-x520LocalityName | at-x520StateOrProvinceName |
      at-x520OrganizationName | at-x520OrganizationalUnitName |
      at-x520Title | at-x520dnQualifier | at-x520countryName |
      at-x520SerialNumber | at-x520Pseudonym | at-domainComponent |
      at-emailAddress, ... }

SingleAttribute{ATTRIBUTE:AttrSet} ::= SEQUENCE {
      type      ATTRIBUTE.&id({AttrSet}),
      value     ATTRIBUTE.&Type({AttrSet}{@type})
  }

-- MATCHING-RULE is another ClassType
-- not shown here to keep things simple(!)

ATTRIBUTE ::= CLASS {
      &id             OBJECT IDENTIFIER UNIQUE,
      &Type           OPTIONAL,
      &equality-match MATCHING-RULE OPTIONAL,
      &minCount       INTEGER DEFAULT 1,
      &maxCount       INTEGER OPTIONAL
  } WITH SYNTAX {
      [TYPE &Type]
      [EQUALITY MATCHING RULE &equality-match]
      [COUNTS [MIN &minCount] [MAX &maxCount]]
      IDENTIFIED BY &id
  }

DER Encoding of SubjectAltName

SubjectAltName is a potentially very complex X.509 Certificate V3 Extension (though a long way from the most complex). We'll keep it as simple as sensible. We know from its assignment that it has an OID of 2 5 29 17, is not, generally, a critical extension (unless the subject attribute of the X.509 certificate in which it appears is empty (has a length = 0) in which case it must be marked critical) and its contents are of UserType GeneralNames. We will provide two GeneralNames (strictly for the CHOICE at GeneralName) one directoryName - a UserType of Name (a DN with the value cn=me, o=my) and one of dNSName - a UniversalType of IA5String (with the value example.com). The encoding is shown hierarchically and then as an assembled string. We use the ASN.1 commenting style even though it's not relevant in this case.

The ASN.1 source we are using is this (relevant parts for this example only):


ext-SubjectAltName EXTENSION ::= { SYNTAX
       GeneralNames IDENTIFIED BY id-ce-subjectAltName }
   id-ce-subjectAltName OBJECT IDENTIFIER ::=  { id-ce 17 }
	 
GeneralNames ::= SEQUENCE SIZE (1..MAX) OF GeneralName

   GeneralName ::= CHOICE {
        otherName                   [0]  INSTANCE OF OTHER-NAME,
        rfc822Name                  [1]  IA5String,
        dNSName                     [2]  IA5String,
        x400Address                 [3]  ORAddress,
        directoryName               [4]  Name,
        ediPartyName                [5]  EDIPartyName,
        uniformResourceIdentifier   [6]  IA5String,
        iPAddress                   [7]  OCTET STRING,
        registeredID                [8]  OBJECT IDENTIFIER
   }
	 
   Extension{EXTENSION:ExtensionSet} ::= SEQUENCE {
      extnID      EXTENSION.&id({ExtensionSet}),
      critical    BOOLEAN  DEFAULT FALSE,
      extnValue   OCTET STRING (CONTAINING
                  EXTENSION.&ExtnType({ExtensionSet}{@extnID}))
                  --  contains the DER encoding of the ASN.1 value
                  --  corresponding to the extension type identified
                  --  by extnID
  }
	
  Name ::= CHOICE { -- only one possibility for now --
      rdnSequence  RDNSequence }

  RDNSequence ::= SEQUENCE OF RelativeDistinguishedName

  DistinguishedName ::=   RDNSequence

  RelativeDistinguishedName  ::=
      SET SIZE (1 .. MAX) OF SingleAttribute { {SupportedAttributes} }

  --  These are the known name elements for a DN

  SupportedAttributes ATTRIBUTE ::= {
      at-name | at-surname | at-givenName | at-initials |
      at-generationQualifier | at-x520CommonName |
      at-x520LocalityName | at-x520StateOrProvinceName |
      at-x520OrganizationName | at-x520OrganizationalUnitName |
      at-x520Title | at-x520dnQualifier | at-x520countryName |
      at-x520SerialNumber | at-x520Pseudonym | at-domainComponent |
      at-emailAddress, ... }
			
  SingleAttribute{ATTRIBUTE:AttrSet} ::= SEQUENCE {
      type      ATTRIBUTE.&id({AttrSet}),
      value     ATTRIBUTE.&Type({AttrSet}{@type})
  }
	
ATTRIBUTE ::= CLASS {
-- encoding order taken from class body not WITH SYNTAX
      &id             OBJECT IDENTIFIER UNIQUE,
      &Type           OPTIONAL,
      &equality-match MATCHING-RULE OPTIONAL,
      &minCount       INTEGER DEFAULT 1,
      &maxCount       INTEGER OPTIONAL
  } WITH SYNTAX {
      [TYPE &Type]
      [EQUALITY MATCHING RULE &equality-match]
      [COUNTS [MIN &minCount] [MAX &maxCount]]
      -- only OID is mandatory
      IDENTIFIED BY &id
  }	
	
DirectoryString{INTEGER:maxSize} ::= CHOICE {
      teletexString    TeletexString(SIZE (1..maxSize)),
      printableString  PrintableString(SIZE (1..maxSize)),
      bmpString        BMPString(SIZE (1..maxSize)),
      universalString  UniversalString(SIZE (1..maxSize)),
      uTF8String       UTF8String(SIZE (1..maxSize))
  }

This is quite long-winded and you may end up not seeing the wood for the trees (pretty stupid idiom when you think about it, which we never do).

-- each extension is wrapped in a SEQUENCE to delimit it 
-- it's triggered by the Extension assignment
-- Extension{EXTENSION:ExtensionSet} ::= SEQUENCE 

-- DER encoding of EXTENSION wrapper SEQUENCE

3034
-- type (tag) is 16 (10 hex) = SEQUENCE with the Constructed 
-- bit set (20 hex, 10 | 20 = 30 hex), 
-- length = 52 (34 hex) which includes all the elements in the SEQUENCE (5 + 47),
-- value is all the items in the SEQUENCE 

  -- DER encoding of id (OID = 2.5.29.17)
  -- id-ce-subjectAltName OBJECT IDENTIFIER ::=  { id-ce 17 }

  0603551d11

  -- The Type/Tag is 6 (06 hex) = OBJECT IDENTIFIER
  -- length is 3,
  -- value is the encoding of 2 5 29 17 (or 2.5.29.17 in LDAP format)
  -- (more detail of how this is DER encoded)

  -- Critical (a BOOLEAN) would be included here if TRUE 
  -- (SEQUENCE is ordered)
  -- it's FALSE (the DEFAULT) for the example so DER omits it

  -- DER encoding of extnValue, the OCTET STRING that encapsulates
  -- the extension contents (see VITAL note earlier)
  -- extnValue   OCTET STRING (CONTAINING ...)

  042d

  -- The Type/Tag is 4 (04 hex) = OCTET STRING
  -- length = 45 (2d hex) which covers the encoded GeneralNames (2 + 43),
  -- value is the DER encoding of GeneralNames that follows

  -- DER encoding of GeneralNames starts with a SEQUENCE
  -- GeneralNames ::= SEQUENCE SIZE (1..MAX) OF GeneralName
  -- SIZE and constraints not visible in encoding

  302b

  -- type (tag) is 16 (10 hex) = SEQUENCE with the Constructed
  -- bit set (20 hex, 10 | 20 = 30 hex), 
  -- length = 43 (2b hex) which includes all the elements in the SEQUENCE (13 + 30),
  -- value is all the items in the SEQUENCE 

    -- DER encoding of dNSName = example.com
    -- dNSName [2] IA5String uses IMPLICIT tagging (the module default)

    820b6578616d706c652e636f6d

    -- The Type/Tag is 82 (hex) consisting of
    -- Context Specific class (80 hex), TaggedType = 2
    -- it replaces the IA5String tag of 22 (16 hex)
    -- length is 11 (0b hex),
    -- value is hex (ascii) of 'example.com'

    -- DER encoding of directoryName (DN) cn=me, o=my
    -- directoryName [4] Name is EXPLICIT even in an IMPLICIT TAGS
    -- module because Name is a CHOICE

    a41c

    -- The Type/Tag is a4 (hex) consisting of
    -- Context Specific class (80 hex), constructed (20 hex)
    -- TaggedType = 4
    -- length is 28 (1c hex) which covers the RDNSequence (2 + 26)

    -- RDNSequence ::= SEQUENCE OF RelativeDistinguishedName
    -- this is a SEQUENCE of RDNs RDN = cn=me, RDN = o=my

    301a

    -- type (tag) is 16 (10 hex) = SEQUENCE with the Constructed
    -- bit set (20 hex, 10 | 20 = 30 hex), 
    -- length = 26 (1a hex) which includes all the elements in the 
    -- SEQUENCE (13 + 13),
    -- value is all the items in the SEQUENCE 

      -- each RDN starts with a SET
      -- RelativeDistinguishedName  ::=
      --      SET SIZE (1 .. MAX) OF SingleAttribute { {SupportedAttributes} }

      -- DER encoding of RDN cn=me
      -- cn (commonName) has the OID 2 5 4 3
      -- starting SET 

      310b

      -- type (tag) is 17 (11 hex) = SET with the Constructed
      -- bit set (20 hex, 11 | 20 = 31 hex), 
      -- length = 11 (0b hex) which includes all the elements in the SET (2 + 9),
      -- value is all the items in the SET 

        -- followed by a SEQUENCE covering the items in the attribute (cn)

        3009

        -- type (tag) is 16 (10 hex) = SEQUENCE with the Constructed
        -- bit set (20 hex, 10 | 20 = 30 hex), 
        -- length = 9 (09 hex) which includes all the elements in 
        -- the SEQUENCE (5 + 4),
        -- value is all the items in the SEQUENCE 

        -- DER encoding of cn(commonName) OID 2 5 4 3

        0603550403

        -- The Type/Tag is 6 (06 hex) = OBJECT IDENTIFIER
        -- length is 3,
        -- value is the encoding of 2 5 4 3 (or 2.5.4.3 in LDAP format)
        -- (more detail of how this is DER encoded)

        -- DER encoding of me (cn=me)

        13026d65

        -- The Type/Tag is 19 (13 hex) = PrintableString (one of the CHOICE 
        -- for DirectoryString) 
        -- length is 2,
        -- value is hex (ascii) of 'me'

      -- Rinse and repeat for RDN o=my gives (o = organization OID = 2 5 4 10)
      -- starts at RDN SET definition

      310b3009060355040a13026d79

-- complete DER octet string
30340603551d11042d302b820b6578616d706c652e636f6da41c301a
310b3009060355040313026d65310b3009060355040a13026d79
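
The Name part of this example (everything from 301a onwards) is the easiest piece to rebuild by machine, so it makes a handy self-check. The Python sketch below nests the items from the inside out; tlv and rdn are our own helper names, the OID value octets are typed in directly rather than computed, and only short-form lengths are handled.

```python
def tlv(tag: int, value: bytes) -> bytes:
    """Tag, length, value with a short-form length (value under 128 octets)."""
    if len(value) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([tag, len(value)]) + value


def rdn(oid_value_hex: str, text: str) -> bytes:
    """One RelativeDistinguishedName holding a single attribute."""
    attribute = tlv(0x30,                                   # SEQUENCE { type, value }
                    tlv(0x06, bytes.fromhex(oid_value_hex))  # attribute OID
                    + tlv(0x13, text.encode("ascii")))       # PrintableString
    return tlv(0x31, attribute)                              # SET OF


cn = rdn("550403", "me")  # 2 5 4 3  commonName
o = rdn("55040a", "my")   # 2 5 4 10 organization
name = tlv(0x30, cn + o)  # RDNSequence

print(cn.hex())    # 310b3009060355040313026d65
print(name.hex())  # 301a310b3009060355040313026d65310b3009060355040a13026d79
```

Because each length is computed from the bytes actually present, a builder like this cannot produce the kind of length slip that is so easy to make by hand.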

PKCS ContentInfo - ASN.1 2002 Version

This worked example uses the ASN.1 (2002) definition of ContentInfo, a basic structure used in most PKCS structures and CMS. (The simpler ASN.1 (1988) Version of ContentInfo is also a worked example.) Snippet from RFC 6268 section 9.

Following the previous example there is little new in this snippet even though it is significantly longer than its 1988 counterpart (and also covers a tad more ground). Comments added to try and explain the more obscure syntax:


-- this is functionally identical to TYPE-IDENTIFIER
-- other than the user-friendly(!) WITH SYNTAX part
-- that allows an optional [TYPE &Type] part

CONTENT-TYPE ::= CLASS {
     &id        OBJECT IDENTIFIER UNIQUE,
     &Type      OPTIONAL
   } WITH SYNTAX {
       [TYPE &Type] IDENTIFIED BY &id
   }

   -- any use of UserType ContentType
   -- will use id (&id) from a CONTENT-TYPE class not, say, TYPE-IDENTIFIER
   -- which uses the same identifier
	 
   ContentType ::= CONTENT-TYPE.&id

   ContentInfo ::= SEQUENCE {
     contentType        CONTENT-TYPE.
                     &id({ContentSet}),
     content            [0] EXPLICIT CONTENT-TYPE.
                     &Type({ContentSet}{@contentType})}

   ContentSet CONTENT-TYPE ::= {
     --  Define the set of content types to be recognized.
     -- these are all valueReference to OID values
     ct-Data | ct-SignedData | ct-EncryptedData | ct-EnvelopedData |
     ct-AuthenticatedData | ct-DigestedData, ... }

-- CONTENT-TYPE user-friendly(!) format with optional [TYPE &Type] omitted

ct-Data CONTENT-TYPE ::= { IDENTIFIED BY id-data }

-- the OID value referenced above

id-data OBJECT IDENTIFIER ::= { iso(1) member-body(2)
     us(840) rsadsi(113549) pkcs(1) pkcs7(7) 1 }
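
The id-data OID has numbers too big for one octet (840 and 113549), which makes it a good test of reading an OID back from DER. A Python sketch, with decode_oid being our own name; it takes the value octets only, without the leading 06 and length.

```python
def decode_oid(value: bytes) -> str:
    """Turn OBJECT IDENTIFIER value octets into dotted form."""
    subs, current = [], 0
    for octet in value:
        current = (current << 7) | (octet & 0x7F)  # 7 useful bits per octet
        if not octet & 0x80:                       # top bit clear: last octet
            subs.append(current)
            current = 0
    first = min(subs[0] // 40, 2)                  # first number is 0, 1 or 2
    numbers = [first, subs[0] - 40 * first, *subs[1:]]
    return ".".join(str(n) for n in numbers)


print(decode_oid(bytes.fromhex("2a864886f70d010701")))  # 1.2.840.113549.1.7.1
print(decode_oid(bytes.fromhex("551d11")))              # 2.5.29.17
```

The first line of output is the OID that the namedNumber form above spells out as iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs7(7) 1.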

The only difference between this definition and the 1988 version is that this provides some limits on the OID that can be used whereas the 1988 version does not.

DER for ContentInfo (2002)

The DER generated from ASN.1 (2002) ContentInfo is identical to that produced by the ASN.1 1988 version. The only difference would be that the OID value used in the 1988 example would be rejected by an ASN.1 (2002) parser/encoder since it does not lie in the permitted range (introduced in the ASN.1 2002 source upgrade). However, since the example used a fictitious OID to illustrate the general principle of encoding OIDs it is still relevant.

ASN.1 and DER for X.509 (Full)

So we return, finally, to the initial ASN.1 example we started with which should all now be pretty understandable with the exception of one gruesomely ugly piece of syntax. Here, again, is the whole ASN.1 fragment:

Certificate  ::=  SIGNED{TBSCertificate}

  TBSCertificate  ::=  SEQUENCE  {
      version         [0]  Version DEFAULT v1,
      serialNumber         CertificateSerialNumber,
      signature            AlgorithmIdentifier{SIGNATURE-ALGORITHM,
                                {SignatureAlgorithms}},
      issuer               Name,
      validity             Validity,
      subject              Name,
      subjectPublicKeyInfo SubjectPublicKeyInfo,
      ... ,
      [[2:               -- If present, version MUST be v2
      issuerUniqueID  [1]  IMPLICIT UniqueIdentifier OPTIONAL,
      subjectUniqueID [2]  IMPLICIT UniqueIdentifier OPTIONAL
      ]],
      [[3:               -- If present, version MUST be v3 --
      extensions      [3]  Extensions{{CertExtensions}} OPTIONAL
      ]], ... }

The double square bracket notation delimits what is called an extension addition group. [[ starts the group definition and ]] terminates it. There are two of them in the above certificate. Both extension addition groups use the optional version number feature (the 2: and 3: values after the opening [[ delimiter). (Note: The extension part of the ASN.1 term extension addition group has nothing to do with Extension, EXTENSION or even extensions used in X.509 certificate ASN.1 source definitions - simply an unhappy coincidence.)

Modern ASN.1 parsers love this stuff. The effect on the DER encoding of this additional syntax (over its 1988 predecessor) is - nada, nothing. (There are, however, very good reasons to use the enhanced syntax.) The only lines that affect the decoding are effectively:


-- these are all optional elements

 issuerUniqueID  [1]  IMPLICIT UniqueIdentifier OPTIONAL,
                          -- If present, version MUST be v2 or v3
 subjectUniqueID [2]  IMPLICIT UniqueIdentifier OPTIONAL,
                          -- If present, version MUST be v2 or v3
													
 extensions      [3]  Extensions OPTIONAL
                          -- If present, version MUST be v3 --  }
													
  -- the effect of the definition above is to encapsulate all 
  -- X.509 V3 Extensions (such as SubjectAltName)
  -- in the following DER encoded wrapper
	
  a3xx

  -- a3 (hex) is the encoding of [3] EXPLICIT (EXPLICIT is the default for this module
   -- and has been omitted)
  -- type (tag) has the Context Specific 
  -- class encoded (80 hex), the Constructed bit set (20 hex) and the 
  -- TaggedType number = 3 (03 hex) 
  -- length = xx (the sum of 
  -- all the V3 Extensions present)
  -- value is all the V3 Extensions present
  -- if no V3 Extensions are present in the cert then, since the element is optional 
  -- (meaning the receiver can recognize its absence), the wrapper 
  -- (a3xx above) is not present
	
  -- each V3 Extension in turn is wrapped in a standard sequence starting
	
  30yy
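
The a3xx wrapper described above can be sketched in a few lines of Python. This is a minimal illustration only, not a certificate encoder: the function names der_length and wrap_extensions are our own, and the inner bytes are a placeholder standing in for an already encoded Extensions SEQUENCE (it is not a valid Extensions value).

```python
def der_length(n: int) -> bytes:
    """Definite-form length octets for a value of n octets (X.690 8.1.3)."""
    if n < 0x80:
        return bytes([n])                      # short form: one octet
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body    # long form: count, then length


def wrap_extensions(extensions_seq: bytes) -> bytes:
    """Prefix an encoded Extensions SEQUENCE with the [3] EXPLICIT type."""
    # Context Specific class (80 hex) | Constructed (20 hex) | tag number 3
    tag = 0x80 | 0x20 | 0x03                   # = a3 hex
    return bytes([tag]) + der_length(len(extensions_seq)) + extensions_seq


# Placeholder only: an empty SEQUENCE (3000) used to show the framing.
placeholder = bytes.fromhex("3000")
print(wrap_extensions(placeholder).hex())      # a3023000
```

The xx in a3xx is therefore just the length of everything that follows, including the 30yy header of the inner SEQUENCE.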

DER for X.509 Certificate (Full)

To show all the X.509 Certificate elements being encoded would take us more than a lifetime to write and the only outcome for the reader may be the risk of death by boredom (as if this page is not big enough). Many of the individual fields have already been covered (in horrible detail) in previous worked examples, and the more interesting points were covered within the text above. ASN.1 definitions were taken from RFC 5912 but readers are again reminded that the ASN.1 of earlier versions (RFC 5280 or earlier), while detested by modern ASN.1 parsers/DER encoders, is (possibly) easier for us, mere mortals, to understand. The DER encoding produced by any ASN.1 version is identical. There are a lot of certificates in daily use that were produced a long (long) time ago.

And if you feel cheated out of a DER encoding example, dear reader, we may have saved your life.

ASN.1 RESERVED/KEYWORDS

This is an incomplete list of the Reserved/KEYWORDS used in ASN.1. The Universal Type names are also included in the Reserved/KEYWORD list. (Full list X.680 11.27)

Reserved/KEYWORD	Use/Meaning
ANY DEFINED BY	ANY DEFINED BY is an obsolete (ASN.1 1988/X.208) set of Reserved/KEYWORDS and Literals. They may have been used by the parser/DER encoder but do not appear in any DER encoded bit stream. It is syntax sugar - great for humans but not much else. Typically, replaced by IDENTIFIED BY (in ASN.1, 2002) as illustrated in the TYPE-IDENTIFIER Information Object.
APPLICATION	Only used with TaggedTypes ([x]). Indicates the tag is used by some application and its usage is defined by that application. When the implementor should use APPLICATION and when PRIVATE remains, to us, a mystery.
BIT	Only appears in the context of BIT STRING (a UniversalType).
BY	Syntax sugar. Appears in conjunction with other words most commonly IDENTIFIED BY (X.681) or ENCODED BY or CONSTRAINED BY. Maybe useful to the ASN.1 parser/DER encoder and great for humans but is not visible in the DER encoding.
CHOICE	Precedes a list of elements and indicates that only one type of the choices available will be present in the DER encoded form. CHOICE itself does not appear in the DER encoded form. CHOICE is delimited by {...} (braces or curly brackets). Example: Time ::= CHOICE { utcTime UTCTime, generalTime GeneralizedTime }
CLASS	This keyword is used in the assignment of what we call a ClassType and the standards call an Information Object class (X.681). For usage in an assignment see TYPE-IDENTIFIER.
CONTAINING	Indicates that a Type (UniversalType or UserType) may be constrained (X.682 11) to contain only another Type (UniversalType or UserType).
DEFAULT	Indicates a default value for the Type assignment. An element carries either DEFAULT or OPTIONAL, never both: DEFAULT already implies that the element may be omitted from the encoding, in which case the decoder assumes the default value. Example: TBSCertificate ::= SEQUENCE { version [0] Version DEFAULT v1, -- TaggedType (Context Specific) with tag value = 0 -- Version is a UserType -- DEFAULT value is v1 (a valueReference) -- it could have been an explicit value such as 0 or 537 -- elements omitted for clarity } Version ::= INTEGER { v1(0), v2(1), v3(2) } -- UserType Version assignment defines a range of -- permissible values including v1 (a namedNumber) which -- has the numeric value 0
EXPLICIT	Used in conjunction with [x] syntax (TaggedType, X.680 30) defining Context Specific, APPLICATION or PRIVATE class types and indicates that an item of the defined class, containing the defined tag value x, will be DER encoded and prepended (with the Constructed bit set) to the UniversalType of the element. The default for TaggedTypes is defined for the ASN.1 module using DEFINITIONS IMPLICIT TAGS | EXPLICIT TAGS. In the absence of any such directive EXPLICIT is the default and can be omitted. (See also IMPLICIT). Example: -- ASN.1 source fragment (from GeneralNames assignment) ... dNSName [2] EXPLICIT IA5String, .... -- DER encoded as (hex) A2071605AAAAAAAAAA -- tag = 2 (02 hex) with the Context Specific -- class (80 hex) and Constructed (20 hex) bits set (80 | 20 | 02 = A2 hex) -- length = 7 (includes following type) -- embedded type tag is 22 (16 hex) = IA5String, length = 5 -- value is AAAAAAAAAA (arbitrary 5 hex octets for illustration) -- which according to the ASN.1 source should be interpreted by the decoder -- as representing a DNS name -- Note: EXPLICIT indicates use (the tag) and the format In many cases ASN.1 source modules defined in RFCs provide both an IMPLICIT and EXPLICIT version; implementors, one assumes, are therefore free to choose either. DER decoders should expect either format.
FROM	Used by the ASN.1 parser/DER encoder to limit the character set that can appear in UniversalType strings. Does not appear in any way in the DER encoding.
IDENTIFIED BY	Not strictly a Reserved/KEYWORD but a Literal (X.681 10.7). We added it to this list because...we added it to this list. It is a Literal used in the user-friendly(!) part (defined by WITH SYNTAX) for the TYPE-IDENTIFIER class (Information Object defined in X.681 Annex A). Essentially syntax sugar. Probably used by the ASN.1 parser and great for humans but plays no role in the DER encoding.
IMPLICIT	Used in conjunction with [x] syntax (TaggedType, X.680 30) defining Context Specific, APPLICATION or PRIVATE class types and indicates that an item of the appropriate class, containing the defined tag value x, will be DER encoded and replace the UniversalType of the element. The default for TaggedTypes is defined for the ASN.1 module using DEFINITIONS IMPLICIT TAGS | EXPLICIT TAGS. In the absence of any such directive EXPLICIT is the default and can be omitted. (See also EXPLICIT). Example: -- ASN.1 source fragment (from GeneralNames assignment) ... dNSName [2] IMPLICIT IA5String, .... -- DER encoded as (hex) 8205AAAAAAAAAA -- tag = 2 (02 hex) with the Context Specific -- class bits (80 hex) set (80 | 02 = 82 hex) -- length = 5, -- value is AAAAAAAAAA (arbitrary 5 hex octets for illustration only) -- which according to the ASN.1 source should be interpreted by the decoder -- as an IA5String representing a DNS name -- IMPLICIT indicates the use (the tag) and the tag implies the format -- (in the EXPLICIT case the format is explicitly identified) -- IMPLICIT encoding is always shorter than EXPLICIT but requires more -- ASN.1 source knowledge In many cases ASN.1 source modules defined in RFCs provide both an IMPLICIT and EXPLICIT version; implementors, one assumes, are therefore free to choose either. DER decoders should expect either format.
IMPORTS	Supports multiple formats (as does everything in ASN.1) but the most used version indicates that one or more assignments (of UserTypes or ClassTypes) will be imported into an ASN.1 module FROM another ASN.1 module identified (by an OBJECT IDENTIFIER) and named (by a module name). Multiple FROM sequences may be used. The example shows a module definition using IMPORTS (taken from RFC 5912 section 14): PKIX1Explicit-2009 {iso(1) identified-organization(3) dod(6) internet(1) security(5) mechanisms(5) pkix(7) id-mod(0) id-mod-pkix1-explicit-02(51)} -- this module name and OBJECT IDENTIFIER DEFINITIONS EXPLICIT TAGS ::= -- essentially defines a module as an assignment starting from -- BEGIN until END BEGIN IMPORTS -- indicates that one or more Type assignments will be imported Extensions{}, EXTENSION, ATTRIBUTE, SingleAttribute{} -- defines the list of assignments that will be imported from -- a defined ASN.1 module FROM PKIX-CommonTypes-2009 -- identifies the name of the ASN.1 module from which the assignments -- will be imported {iso(1) identified-organization(3) dod(6) internet(1) security(5) mechanisms(5) pkix(7) id-mod(0) id-mod-pkixCommon-02(57)} -- identifies the OID of the module from which the assignments -- will be imported -- optionally, any number of import assignments and FROM definition .... END -- end of module
MAX	Used with the SIZE keyword to define any upper limits on certain Universal types/tags. A typical use is shown below: GeneralNames ::= SEQUENCE SIZE (1..MAX) OF GeneralName -- the range (1..MAX) indicates the upper and lower limits, in this case -- there must be at least one item present in the SEQUENCE. The keyword MAX -- is not defined for SEQUENCE and indicates the sending implementor is free -- to choose a limit, the receiver can receive any number SIZE and any upper and lower limits are not encoded in the DER form (they are used only by the ASN.1 parser/DER encoder). Note: When SIZE is present the SEQUENCE ...... OF form must be used. When MAX is used with specific UniversalTypes it takes the value determined by that type.
MIN	Used with the SIZE keyword to define any lower limits on certain UniversalTypes. MIN has no defined value in ASN.1 but takes its value from the UniversalType it is associated with.
OCTET	Only appears in the context of OCTET STRING (a UniversalType).
OPTIONAL	Can appear on any line item within a SEQUENCE and always appears after the Type definition. Indicates that the item may, or may not, be present in the encoded form. Example: StupidType ::= SEQUENCE { one One, two Two OPTIONAL } One ::= blah Two ::= blah
PRIVATE	Only used with TaggedType ([x]). Indicates the tag is, well, private. When the implementor should use PRIVATE and when APPLICATION remains, to us, a mystery.
SIZE	Indicates that there is an upper and/or lower limit to the UniversalType it is associated with. A typical use is shown: GeneralNames ::= SEQUENCE SIZE (1..MAX) OF GeneralName -- the range (1..MAX) indicates the upper and lower limits, in this case -- there must be at least one item present in the SEQUENCE. The keyword MAX -- is not defined for SEQUENCE (or anything else) and indicates the sending implementor is free -- to choose a limit, the receiver must be able to handle any number of items SIZE and any upper and lower limits are not DER encoded. They are used by the sending (encoding) implementor only. Note the use of the SEQUENCE ......OF form when SEQUENCE is used with SIZE, which indicates that a SIZE constraint can only be applied to items of the same Type.
SYNTAX	Confusing (we find most things confusing). It has two uses. First, in the form WITH SYNTAX it is used to introduce the user-friendly (!) syntax within a ClassType assignment (X.681 10). Second, as a Literal (X.681 10.6 and 10.7) used in the form SYNTAX only. (See an example of both uses in TYPE-IDENTIFIER.)
TYPE-IDENTIFIER	An Information Object (we refer to them as ClassTypes) that is built in through its definition in X.681 Annex A. Its assignment is: TYPE-IDENTIFIER ::= CLASS { -- the & notation is an artifact of CLASS (Information Object class) definition -- but the case of the following name is significant (identifier or TypeName) &id OBJECT IDENTIFIER UNIQUE, &Type } WITH SYNTAX {SYNTAX &Type IDENTIFIED BY &id} -- SYNTAX and IDENTIFIED BY are user-friendly(!) Literals to allow -- the human to recognize what thing to use and where to put it (highly -- technical description), a kind of template -- allows definition of a ClassType which has the very common -- construct of an OBJECT IDENTIFIER (id) and -- whose data content (SYNTAX) is Type which is determined by (IDENTIFIED BY) -- id (OBJECT IDENTIFIER). -- Example usage (from GeneralName) OTHER-NAME ::= TYPE-IDENTIFIER -- so OTHER-NAME will consist of an OID (id) and content (Type) determined by the OID -- and valueReference using this could look like my-OtherName OTHER-NAME ::= { SYNTAX My-Name IDENTIFIED BY my-oid } -- note upper and lower case names first letters My-Name ::= IA5String my-oid OBJECT IDENTIFIER ::= {2 3 5 6} See also the UniversalType INSTANCE OF which is related and expands on this note.
UNIQUE	Does what it says on the tin. The ASN.1 parser will force every instance of the element in which it appears within a module to be different or ... unique. Only used with ClassType (Information Objects defined in X.681). See example usage in TYPE-IDENTIFIER. Only used by the ASN.1 parser, irrelevant to DER encoding.
WITH	Syntax sugar. Always appears with another word such as WITH SYNTAX or WITH COMPONENTS. The ASN.1 parser may find it useful and it makes sense to we humans but does not appear in any way in the DER encoding.
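
The EXPLICIT and IMPLICIT entries in the list above both use dNSName [2] with an IA5String. The following Python sketch reproduces the two framings side by side. It is an illustration under simplifying assumptions: the helper names (tlv, explicit, implicit) are ours, only tag numbers below 31 and values shorter than 128 octets are handled, and the IMPLICIT helper assumes the underlying type is Primitive.

```python
CONTEXT = 0x80       # Context Specific class bits
CONSTRUCTED = 0x20   # Constructed bit
IA5STRING = 0x16     # UniversalType tag 22


def tlv(type_octet: int, value: bytes) -> bytes:
    """Single-octet type, single-octet length, then the value."""
    if len(value) > 0x7F:
        raise ValueError("sketch only handles lengths up to 127")
    return bytes([type_octet, len(value)]) + value


def explicit(tag_no: int, universal_tag: int, value: bytes) -> bytes:
    """[tag_no] EXPLICIT: the tagged wrapper is prepended to the full inner TLV."""
    return tlv(CONTEXT | CONSTRUCTED | tag_no, tlv(universal_tag, value))


def implicit(tag_no: int, value: bytes) -> bytes:
    """[tag_no] IMPLICIT: the tag replaces the UniversalType (Primitive case)."""
    return tlv(CONTEXT | tag_no, value)


value = bytes.fromhex("AAAAAAAAAA")          # arbitrary 5 octets, as in the text
print(explicit(2, IA5STRING, value).hex())   # a2071605aaaaaaaaaa
print(implicit(2, value).hex())              # 8205aaaaaaaaaa
```

The two extra octets in the EXPLICIT form are the embedded IA5String type and length, which is exactly the information an IMPLICIT decoder has to get from the ASN.1 source instead.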

ASN.1 UniversalTypes and Tags

The UniversalTypes recognized by ASN.1. These TypeNames are also included (in the Capitalization form shown) in the ASN.1 Reserved/KEYWORDS List. (Full list of Reserved words in X.680 11.27.)

Name	Tag Decimal	Tag Hex	String	Primitive (P) Constructed (C)	Notes
BOOLEAN	1	01	N	P	FALSE = 0 (00 hex). In BER TRUE is any non-zero value at the sender's discretion, but DER requires all bits set, FF hex (X.690 11.1).
INTEGER	2	02	N	P	Defines an integer. Can have value range constraints and named numbers applied such as: Stupid ::= INTEGER -- simple assignment Dumb ::= INTEGER (12..200) -- value range constrained (used by encoder - invisible to decoder) Dozy ::= INTEGER { v1(0), v2(1), v3(2) } -- named numbers for permissible values (used by encoder - invisible to decoder) (see also REAL)
BIT STRING	3	03	N	P/C	While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which must be a BIT STRING) and in this case the initial item is Constructed (C). Can also be used to define bit significant values as in: SignificantBits ::= BIT STRING { firstBit (0), secondBit (1), thirdBit (2) -- others as required } -- the bit numbering is LEFT to RIGHT starting from 0 -- (not the normal ITU convention) -- allowing an infinitely large number of bits to be defined -- for example if only secondBit is set = 40 (hex)
OCTET STRING	4	04	Y	P/C	Nominally an unrestricted character set string type. While the majority of encodings will provide the whole string as a single Primitive (P), BER allows strings to be constructed from substrings (each of which must be an OCTET STRING) and in this case the initial item is Constructed (C). DER does not permit the constructed form (X.690 10.2). Thus: -- Assume the OCTET STRING AAAAAA (3 octets) -- DER (and normally BER) encodes it as 0403AAAAAA -- but BER could also encode it as (type is 04 = OCTET STRING with Constructed set (24 hex)) 2407 -- first substring 0401AA -- second substring 0402AAAA -- complete BER encoding using substring variant 24070401AA0402AAAA
NULL	5	05	N	P	It's, like, nothing. Encoded with a type of 5, length of 0 and no value. Sometimes used as a placeholder especially for Algorithm parameters for mostly historical reasons.
OBJECT IDENTIFIER	6	06	N	P	Frequently shortened to OID. An OID is a string of space (ASN.1) or dot (LDAP) separated numeric values (subidentifiers) uniquely identifying an entity. Thus, 2 5 4 3 (2.5.4.3 in LDAP) uniquely identifies the attribute commonName (cn). (Some LDAP specific information on OIDs.) X.690 8.19 defines the encoding in detail. Each numeric value (subidentifier is the X.690 term) is encoded separately (without any separator) except the first two numeric values, which are combined into a single subidentifier using the formula (first * 40) + second. Thus, in our 2 5 4 3 example the first two values (2 5) would encode as (2 * 40) + 5 = 85 (55 hex) and the third and fourth (4 3) are encoded separately as 04 03 which results in the hex string 550403. Values <= 127 are encoded in a single octet, otherwise they must use two or more octets with all but the last having bit 8 (IETF bit 0) set. Thus, only 7 bits in each octet (bits 7 - 1 (ITU) or 1 - 7 (IETF)) are used to encode the subidentifiers. Assuming we want to encode the subidentifier 327 (147 hex) this would result in the two hex octets 8247 (see this detailed explanation). Because OBJECT IDENTIFIER assignments define a specific value (the OID) they are always valueReferences. Example: id-data OBJECT IDENTIFIER ::= { iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs7(7) 1 } -- assignment uses the namedNumber format but also could be written as -- id-data OBJECT IDENTIFIER ::= { 1 2 840 113549 1 7 1 } -- DER encoding -- Type/Tag = 06, length = 9 0609 -- 1 2 part (1 * 40) = 40 + 2 = 42 (2a hex) 2a -- 840 = 0348 hex part 8648 -- 113549 = 01bb8d hex part 86f70d -- 1 7 1 part 010701 -- complete DER encoding 06092a864886f70d010701 This site will give you endless hours of fun with OIDs as will this one. Oh, the joy of OIDs.
ObjectDescriptor	7	07	Y	P	Nominally an unrestricted character set string type encoded as if it were an OCTET STRING. Human readable text describing some entity.
INSTANCE OF, EXTERNAL	8	08	N	C	Only applicable to a Type which assigns the ClassType (Information Object) TYPE-IDENTIFIER. The INSTANCE OF type expands to a SEQUENCE to carry the two items used by TYPE-IDENTIFIER (id and Type) or any Type derived from it: -- generic definition (Annex C X.681) SEQUENCE { type-id <DefinedObjectClass>.&id, value [0] <DefinedObjectClass>.&Type } -- Example from GeneralName assignment GeneralName ::= CHOICE { otherName [0] INSTANCE OF OTHER-NAME, -- other stuff omitted for simplicity } OTHER-NAME ::= TYPE-IDENTIFIER -- the INSTANCE OF OTHER-NAME is expanded to create SEQUENCE { type-id OTHER-NAME.&id, value [0] OTHER-NAME.&Type } -- Annex C X.681 assumes that the TaggedType [0] -- should always be EXPLICIT
REAL	9	09	N	P	Used for floating point and non base 10 numbers.
ENUMERATED	10	0A	N	P	Defines a named set that, in the absence of any numeric override, will be allocated an incremental integer value. Examples: Dumb ::= ENUMERATED { -- enumeration uses a simple identifier -- enumeration starts from zero (red = 0, blue = 1 and so on) red, blue, green } Dumb ::= ENUMERATED { -- enumeration uses a namedNumber format -- red = 7; items without a number take the unused values from zero upward (blue = 0, green = 1) red(7), blue, green } -- While the ASN.1 source ENUMERATED assignment may have a significant number of identifiers or namedNumbers, only a single integer will be DER encoded reflecting the selected item for the instance encoding in which the enumeration is referenced.
EMBEDDED PDV	11	0B	N	?
UTF8String	12	0C	Y	P/C	ISO/IEC 10646-1 character set (subsets in Annex A).
RELATIVE-OID	13	0D	N	P	Encoded in a similar manner to OBJECT IDENTIFIER but represents only a partial OID. The receiver must know (application specific) how to assemble a complete OID by combining this relative OID with some other OID previously encoded using some magical process. It can be viewed as a way of reducing data volumes where many OID extensions are used with a common OID base.
SEQUENCE, SEQUENCE OF	16	10	N	C	Precedes an ordered list of items of different types. While in strict terms SEQUENCE is an ordered list of items of different types and SEQUENCE OF is an ordered list of items of the same type, DER encoding treats SEQUENCE and SEQUENCE OF the same (they are encoded with the same DER tag value) though the ASN.1 parser may well enforce different rules. It is almost always written as SEQUENCE unless a SIZE is present in which case the OF is present. SEQUENCE items can be delimited by {...} (braces or curly brackets) as a useful layout feature. SEQUENCE is encoded into DER and always has the constructed bit set.
SET, SET OF	17	11	N	C	Precedes an unordered list of items of different types. While in strict terms SET is an unordered list of items of different types and SET OF is an unordered list of items of the same type, DER encoding treats SET and SET OF the same (they are DER encoded with the same tag value) though any ASN.1 parser may apply rules to enforce the OF variant. It is almost always written as SET unless a SIZE is present in which case the SET OF variant is used. SET is delimited by {} (braces or curly brackets). SET is DER encoded and always has the constructed bit set (C).
NumericString	18	12	Y	P/C	Restricted character set string type. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and SPACE. While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which may be of a different type) and in this case the initial item is Constructed (C).
PrintableString	19	13	Y	P/C	Restricted character set string type. a-z, A-Z, 0-9, ' () +,-.?:/= and SPACE (ASCII/IRA5 subset). While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which may be of a different type) and in this case the initial item is Constructed (C).
TeletexString, T61String	20	14	Y	P/C	Restricted (ITU T.61 and T.101) character set string type. While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which may be of a different type) and in this case the initial item is Constructed (C).
VideotexString	21	15	Y	P/C	Restricted character set (ITU T.100 and T.101) string type. While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which may be of a different type) and in this case the initial item is Constructed (C).
IA5String	22	16	Y	P/C	Restricted character set string type. International Alphabet 5 (More correctly International Reference Alphabet No. 5 - IRA5) Also ISO 646 and ITU T.50. ASCII is the US national variant. While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which may be of a different type) and in this case the initial item is Constructed (C).
UTCTime	23	17	Y	P	Nominally an unrestricted character set string type encoded as if it were an OCTET STRING.
GeneralizedTime	24	18	Y	P	Nominally an unrestricted character set string type encoded as if it were an OCTET STRING.
GraphicString	25	19	Y	P/C	Restricted character set string type. IRA5 printable set (ISO 646/T.50) (See IA5String). While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which may be of a different type) and in this case the initial item is Constructed (C).
VisibleString, ISO646String	26	1A	Y	P/C	Restricted character set string type. IRA5 Printable characters (basically excludes all Control Characters from ISO 646/T.50). While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which may be of a different type) and in this case the initial item is Constructed (C).
GeneralString	27	1B	Y	P/C	Restricted character set string type. IRA5 printable set (ISO 646/T.50). While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which may be of a different type) and in this case the initial item is Constructed (C).
UniversalString	28	1C	Y	P/C	ISO10646-1 character set (or Annex A subset). While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which may be of a different type) and in this case the initial item is Constructed (C).
CHARACTER STRING	29	1D	Y	?	Specialized use, extended character set defined by OBJECT IDENTIFIER. Buy a lottery ticket if you come across this one. Bound to win.
BMPString	30	1E	Y	P/C	ISO/IEC 10646-1 Part 1: Architecture and Basic Multilingual Plane (subset defined in Annex A). While the majority of encodings will provide the whole string as a single Primitive (P) the standards allow strings to be constructed from substrings (each of which may be of a different type) and in this case the initial item is Constructed (C).
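
The OBJECT IDENTIFIER rules in the table (the first two values folded together as (first * 40) + second, then 7 bits per octet with bit 8 set on all but the last octet) can be sketched in Python as follows. The function names base128 and encode_oid are ours, and the sketch assumes the encoded OID body is shorter than 128 octets so that a single length octet is enough.

```python
def base128(n: int) -> bytes:
    """Encode one subidentifier, 7 bits per octet, bit 8 set on all but the last."""
    out = [n & 0x7F]                 # last octet: bit 8 clear
    n >>= 7
    while n:
        out.append(0x80 | (n & 0x7F))
        n >>= 7
    return bytes(reversed(out))


def encode_oid(arcs) -> bytes:
    """DER encode an OID given as a list of integers, for example [2, 5, 4, 3]."""
    first, second, *rest = arcs
    body = base128(first * 40 + second)   # first two values share a subidentifier
    for arc in rest:
        body += base128(arc)
    if len(body) > 0x7F:
        raise ValueError("sketch only handles short-form lengths")
    return bytes([0x06, len(body)]) + body


print(encode_oid([2, 5, 4, 3]).hex())                    # 0603550403
print(encode_oid([1, 2, 840, 113549, 1, 7, 1]).hex())    # 06092a864886f70d010701
print(base128(327).hex())                                # 8247
```

The second line is the id-data OID (1 2 840 113549 1 7 1): 840 becomes 8648 and 113549 becomes 86f70d.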

DER Overview (TLV)

DER (Distinguished Encoding Rules) is a classic TLV (Type, Length, Value) encoding scheme. It differs slightly from BER (Basic Encoding Rules) by removing some options to make it simpler.

X.690 should be consulted for all the DER encoding details. The X.690 2002 Standard is freely available from the ITU because it has been superseded by later versions; however, this version of the standard is used in current (2017) RFCs. As ITU documents go it is both eminently readable and even understandable (sometimes even at the first reading). This recommendation does not mean you will disappear down a rabbit hole.

The type field encoding, since it is crucial to many of the concepts in this guide, is provided and uses the same terminology as developed in this guide (hopefully). Some trivial notes are provided about length encoding that may clear up immediate questions. Or maybe not.

DER Type/Tag Encoding

Since the type value has significant implications for many of the concepts included in this survival guide its construction is explained in detail using the terminology developed in this guide. However, X.690 (2002) section 8.1.2, freely available from the ITU, should be used as the definitive source.

The DER type encoding (X.690 refers to this as the identifier octets) is the first part of any DER encoding sequence. The type may consist of one or more octets. Tag values in the range 0 to 30 inclusive (00 to 1E hex) use a single octet encoding. Tag values greater than 30 (> 1E hex) use a two or more octet encoding.

DER Type - Single Octet Encoding

For tag values in the range 0 to 30 (inclusive), which covers all the UniversalTypes, the single octet is encoded as:

ITU Bit Number	8	7	6	5	4	3	2	1
IETF Bit Number	0	1	2	3	4	5	6	7
	Class		C/P	Tag Value
Universal	0	0	0 = Primitive 1 = Constructed	tag values 0 (00 hex) to 30 (1E hex)
Application	0	1
Context Specific	1	0
Private	1	1

Notes:

Class encoding corresponds to the TaggedType ([x] syntax) covering Context Specific, Application and Private classes. Universal class indicates the encoded item is a UniversalType
The UniversalType list indicates whether it is Constructed or a Primitive. Context Specific, Application and Private classes also use this bit when the EXPLICIT Reserved/KEYWORD is present or defaulted. Example of a Context Specific Constructed encoding.
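
The bit layout in the table can be picked apart with a few masks. The Python sketch below (decode_type_octet is our own name) splits a single type octet into its class, Constructed flag and tag value. A tag value of 31 (1F hex) is not a real tag: it signals the multiple octet form described in the next section.

```python
CLASSES = ("Universal", "Application", "Context Specific", "Private")


def decode_type_octet(octet: int):
    """Split a single type (identifier) octet into (class, constructed, tag value)."""
    cls = CLASSES[octet >> 6]          # ITU bits 8 and 7
    constructed = bool(octet & 0x20)   # ITU bit 6
    tag_value = octet & 0x1F           # ITU bits 5 to 1
    return cls, constructed, tag_value


print(decode_type_octet(0x30))   # ('Universal', True, 16)
print(decode_type_octet(0xA3))   # ('Context Specific', True, 3)
print(decode_type_octet(0x82))   # ('Context Specific', False, 2)
```

So 30 (hex) is a Constructed Universal tag 16 (SEQUENCE), a3 is the [3] EXPLICIT wrapper and 82 is a [2] IMPLICIT item.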

DER Type - Multiple Octet type Encoding

Tag values greater than 30 (1e hex) use two or more octets to encode the tag value as shown:

placeholder
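
As a rough guide to the multiple octet form, X.690 8.1.2.4 sets the tag value bits of the first octet to all ones (1F hex) and then carries the tag value in the following octets, 7 bits at a time, with bit 8 set on every octet except the last. The Python sketch below follows that description; encode_type is our own name, and the class is passed as the raw class bits (00, 40, 80 or C0 hex).

```python
def encode_type(class_bits: int, constructed: bool, tag_value: int) -> bytes:
    """Encode the type (identifier) octets for any non-negative tag value."""
    lead = class_bits | (0x20 if constructed else 0x00)
    if tag_value <= 30:
        return bytes([lead | tag_value])        # single octet form
    octets = [tag_value & 0x7F]                 # last octet: bit 8 clear
    tag_value >>= 7
    while tag_value:
        octets.append(0x80 | (tag_value & 0x7F))
        tag_value >>= 7
    # first octet has all five tag bits set, then the tag value octets follow
    return bytes([lead | 0x1F]) + bytes(reversed(octets))


print(encode_type(0x80, True, 3).hex())      # a3
print(encode_type(0x80, False, 31).hex())    # 9f1f
print(encode_type(0x40, False, 201).hex())   # 5f8149
```

Tag values this large are rare in PKIX material, so most decoders will only ever exercise the first branch.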

DER Length Encoding

X.690 (section 8.1.3) should be consulted for the detailed encoding of item length. The X.690 2002 Standard is freely available from the ITU because it has been superseded by later versions; however, this version of the standard is used in current (2017) RFCs. As ITU documents go it is both eminently readable and even vaguely understandable.

X.690 defines two length types - definite and indefinite. DER only uses the definite method. Less to read.

The following notes may answer some immediate questions (or they may not):

1. The length field applies only to the Value part (content octets in X.690 terminology) of the TLV (Type, Length, Value) encoding and explicitly excludes the type (one or more octets) and the length (one or more octets).
2. If the constructed bit is set in the type then the length includes all the items in the construction (for example, a SEQUENCE). Nominally the Value part of a constructed item is regarded as all the items in the construction.
3. Items whose value part (see note 1 above) is <= 127 (7F hex) octets long have their length encoded in a single octet. Lengths greater than 127 use two or more octets: the first octet has ITU Bit 8 set and its remaining 7 bits give the number of length octets that follow, which hold the length itself as an unsigned big-endian integer, as described in X.690 section 8.1.3.5.
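
A short Python sketch of the definite length rules in X.690 8.1.3 may help. The names encode_length and decode_length are ours. The encoder always produces the shortest form, which is what DER requires, and the decoder rejects the indefinite marker (80 hex) since DER does not use it.

```python
def encode_length(n: int) -> bytes:
    """Definite-form length octets for a value part of n octets."""
    if n < 0:
        raise ValueError("length cannot be negative")
    if n <= 0x7F:
        return bytes([n])                          # short form, one octet
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body        # long form


def decode_length(buf: bytes, pos: int = 0):
    """Return (length, position of the first value octet)."""
    first = buf[pos]
    if first < 0x80:
        return first, pos + 1
    count = first & 0x7F                           # number of length octets
    if count == 0:
        raise ValueError("indefinite length is not permitted in DER")
    end = pos + 1 + count
    return int.from_bytes(buf[pos + 1:end], "big"), end


print(encode_length(5).hex())      # 05
print(encode_length(127).hex())    # 7f
print(encode_length(128).hex())    # 8180
print(encode_length(300).hex())    # 82012c
print(decode_length(bytes.fromhex("82012c")))   # (300, 3)
```

Note that the length octets never count themselves or the type octets, only the value part.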

DER UniversalType Encoding

X.690 (sections 8.2 to 8.22) should be consulted for the detailed encoding of UniversalTypes. The X.690 2002 Standard is freely available from the ITU because it has been superseded by later versions; however, this version of the standard is used in current (up to 2017) RFCs. As ITU documents go X.690 is both eminently readable and even vaguely understandable.

Some notes and encodings are included, where relevant, under each type within the UniversalType list and in the Reserved/KEYWORD list.

Change Log

The Page modified date at the foot of this page is always correct.

2nd, September, 2017: Initial page release with a number of missing elements in Reserved/KEYWORDS (mostly of an exotic or not common nature) and some missing descriptive notes in Universal Types.

Problems, comments, suggestions, corrections (including broken links) or something to add? Please take the time from a busy life to 'mail us' (at top of screen), the webmaster (below) or info-support at zytrax. You will have a warm inner glow for the rest of the day.