VerityData API – Technical Details¶

General Overview¶

The VerityData API allows authorized users to query directly from VerityData's vast database of both processed and unprocessed SEC filings.

This documentation outlines API usage instructions, filing structure, response fields, and also touches on how filings are internally processed for accuracy.

Additional information and sample API calls can be found at https://www.infilings.com/api/.

VerityData Coverage Summary¶

Filing Type	Structured Filings	Structured Sections
10-K	40,000+	13,000,000+
10-Q	100,000+	15,000,000+
IPO Filing	5,000+	3,500,000+
Proxy	30,000+	4,250,000+

Our coverage spans over 5,500 companies and includes most 10-Ks filed since 2013 and most IPOs filed since 2016.

Pre-authorization Requirements¶

Accessing the VerityData API requires an authorization token provided by VerityData. Please contact your Customer Success Manager for information and pricing related to obtaining a token.

Format¶

API requests must be submitted using the POST method with a JSON payload to: https://www.infilings.com/api/api.php.

Requests must include the user's numerical VerityData id, authorization token, action, and any arguments required for the action.
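
As a minimal sketch of the request format described above, the following Python builds a JSON payload and prepares a POST request using only the standard library. The payload key names (`id`, `token`, `action`) and the helper names `build_request` and `call_api` are assumptions for illustration; this document does not spell out the exact key names, so check the sample calls at https://www.infilings.com/api/ before relying on them.

```python
import json
import urllib.request

API_URL = "https://www.infilings.com/api/api.php"  # endpoint given in this document


def build_request(user_id, token, action, **arguments):
    """Build (but do not send) a POST request with a JSON payload.

    The key names "id", "token", and "action" are assumed, not confirmed.
    """
    payload = {"id": user_id, "token": token, "action": action}
    payload.update(arguments)  # action-specific arguments, e.g. iacc or body
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(request, timeout=30):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `call_api(build_request(12345, "your-token", "FilingList", companyid="AAPL"))` would then return the decoded response, assuming the key names match.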

API Requests¶

The following details various response objects and "core" actions available via the API, along with their respective argument options. Information on additional actions can be found at: https://www.infilings.com/api/api.php.

Response Objects¶

Field	Type	Description
status	string	Status of the response. ok: no issues in the response. fail: the response failed. partial: a filing within the response encountered an error (MultiFilingContent and Search actions only).
filings	array	Array of objects containing data for the individual filings included in the response.
incompleteData	boolean	Shows True if the request encountered an error on a single filing but still includes complete data for preceding filings included in the request. In this case, the last filing in the response encountered an error and has incomplete data. MultiFilingContent and Search actions only.
reason	string	Description of error, if error occurred.
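
The status and incompleteData fields above can be checked before using a response. The sketch below assumes the decoded response is a Python dict with the documented field names and that `filings` is a list of dicts; the helper name `usable_filings` is hypothetical. Note that FilingContent returns a single "filing" rather than "filings", which this sketch does not handle.

```python
def usable_filings(response):
    """Return the filings that carry complete data.

    Raises RuntimeError when the status is "fail". Filings flagged with a
    truthy incompleteData value are dropped.
    """
    if response.get("status") == "fail":
        raise RuntimeError(response.get("reason", "request failed"))
    filings = response.get("filings", [])
    return [f for f in filings if not f.get("incompleteData")]
```

With a "partial" response, this keeps the filings that precede the failed one and discards the filing marked incomplete.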

Filings Object¶

Field	Type	Description
ticker	string
CUSIP	string
companyname	string
iacc	int	VerityData identifier for the filing.
cik	int	SEC CIK for the filer.
allfilerciks	int[]	All CIKs included in the SEC filing submission header.
accessionnum	string	EDGAR accession number.
formtype	string
datefiled	date
filedtimestamp	timestamp	Date & time when the filing was received by the SEC.
received	timestamp	Date & time when the filing was received by VerityData.
spotapproved	boolean	True if the filing has been processed and structured by VerityData.
spotapproveddate	timestamp	Date & time when filing was approved.
seclink	URL	Link to view filing on EDGAR.
documentperiodenddate	date	Document period date taken from DocumentEntityInformation.
documentfiscalperiodfocus	string	Fiscal period for filing taken from DocumentEntityInformation.
documentfiscalyearfocus	int	Fiscal year for filing taken from DocumentEntityInformation.
toc	array	Array containing "table of contents" related information.
sections	array	Array containing data for each section within the filing (body, title, etc.)
incompleteData	boolean	Shows True if an error occurred while processing the IACC. All filings in the response with no value in this field are valid with complete data. MultiFilingContent and Search actions only.

Sections Object¶

Field	Type	Description
sectionid	integer	Identifier of the individual section in the respective filing.
prevsectionid	integer	Identifier of the corresponding section from the prior filing.
itemid	integer	Identifier for the parent item of the section. You can consider an item as a ‘container’ for one or more sections.
tlitem	string	The "Top-Level" item which contains the section. Top-level items are fundamentally based on the SEC's required items for a given form-type.
intro	boolean	'True' if the section is the first section of its parent item else 'False'. Often, intro sections will only contain a title for the item and nothing more, but that is not always the case.
title	string	The title for the section.
filingorder	integer	Numerical identifier indicating relative position within the filing. Smallest values are at the beginning of the filing while largest are at the end.
changetype	char	The type of change for the respective item or section, as determined by our proprietary change algorithm. N: New Disclosure; D: Deleted Disclosure; U: Unchanged Disclosure; F: Full / Major Change; B: Big Change; M: Medium Change; S: Small Change.
boilerplate	boolean	'True' if the text within the Top Level Item, Item, or Section can be considered “boilerplate”, else 'False'. This distinction is made using a combination of automated textual analysis and manual analyst review.
itemidpath	[integer]	Array listing all itemids which contain the section. Multiple itemids indicate that the section is located within nested items.
tags	[string]	Array of categorical tags for the section.
body	string	The complete text of the respective section or item. There are four different output versions for the body content: Plain: this is a plain text version of the disclosure text and scrubs the output of any HTML or formatting from the file. For disclosures that contain tables, we remove the HTML and include the raw contents of each table cell in the output. HTML: this is the ‘original’ HTML of the source filing for the respective disclosure text. Diff: this is the ‘original’ HTML of the source filing for the respective disclosure text, along with our proprietary change algorithm output. See more details below on the Diff output including the stylesheet definitions to better understand how to denote added and deleted text. Machine: this is a JSON representation of the Diff of the body scrubbed of any HTML and punctuation; we also ‘stem’ the output of words (using the industry standard snowball stemmer). For disclosures that contain numeric tables (typically tables where a majority of cells are numbers or have color banding), we insert a [NUMERIC TABLE] marker into the output where the table was located. The output takes into account our proprietary “diffing” software to allow users to easily see what text is added, removed, and changed. This allows users to focus on the core text and facilitate textual analysis, machine learning, and more. See more below on the machine output. Plain Machine: this is a JSON representation of the tokenized and stemmed “plain” body. As with the “Machine” output, words are tokenized using the snowball stemmer.
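
As an illustration of how the changetype, boilerplate, and filingorder fields can be combined, the sketch below selects non-boilerplate sections with the larger change types and returns them in filing order. It assumes each section is a dict keyed by the field names in the table above; the helper name `material_changes` and the default selection of change types are choices made for this example only.

```python
# Change-type codes as listed in the Sections Object table.
CHANGE_TYPES = {
    "N": "New Disclosure",
    "D": "Deleted Disclosure",
    "U": "Unchanged Disclosure",
    "F": "Full / Major Change",
    "B": "Big Change",
    "M": "Medium Change",
    "S": "Small Change",
}


def material_changes(sections, change_types=("N", "F", "B")):
    """Return non-boilerplate sections with the given change types,
    ordered by their position in the filing."""
    selected = [
        s for s in sections
        if s.get("changetype") in change_types and not s.get("boilerplate")
    ]
    return sorted(selected, key=lambda s: s.get("filingorder", 0))
```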

TOC Object (Table of Contents)¶

Field	Type	Description
order	[string]	Array listing the Top Level Items, in order, for the filing.
.tlitem.[name].itemtitle	string	Title for the Top Level Item. These titles typically match our internal name for the Top Level Item.
.tlitem.[name].itemid	integer	The itemid for the Top Level item.
.tlitem.[name].level	integer	The Top Level Item's level within the filing. (Top Level Items are always Level 1)
.tlitem.[name].parent	string	Parent item title for the Top Level Item. (By definition, Top Level Items have no parent items, so this will yield a 'None' value)
.tlitem.[name].tlitem	string	The name of the Top Level Item. (By definition, yields itself)
.tlitem.[name].content	[string]	Array listing all items and sections contained within the Top Level item, along with various table of content identifiers. For items, output includes: itemtitle itemid level parent tlitem content - array which lists the sections within the item and their toc identifiers For sections, output includes: itemid title sectionid intro
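
The field paths above suggest that the TOC is a mapping with an `order` array and a `tlitem` mapping keyed by Top Level Item name. Under that assumption (the exact nesting is not confirmed here), a short sketch for listing Top Level Item titles in filing order could look like this; the helper name `top_level_titles` is hypothetical.

```python
def top_level_titles(toc):
    """Return (name, itemtitle) pairs for Top Level Items, in filing order.

    Assumes toc looks like {"order": [...], "tlitem": {name: {...}}}.
    """
    entries = toc.get("tlitem", {})
    return [
        (name, entries[name].get("itemtitle"))
        for name in toc.get("order", [])
        if name in entries
    ]
```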

API Calls / Actions¶

FilingList¶

Outputs a list of filings meeting the argument criteria. Response includes Filings Object.

Arguments¶

Argument	Type	Description
companyid	int or string	Identifier for the company. Accepted identifiers are: Ticker, CIK, CUSIP
dateStart	date or timestamp	Start date for the query. (Inclusive)
dateEnd	date or timestamp	End date for the query. (Inclusive)
dateType	string	The date/time to use for dateStart/dateEnd arguments. Valid values are "datefiled", "sec_received", "infilings_received", and "reviewdate". Defaults to "datefiled"
reviewed	boolean	If True, only return reviewed filings. If False, only return unreviewed filings. If null or not present, return filings of either status.
mcapLow	int	Lower bound for optional marketcap filter.
mcapHigh	int	Upper bound for optional marketcap filter.
formTypes	string[]	Array of formtypes to include.
limit	int	Output limit. Defaults to 100.
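
A minimal sketch of assembling FilingList arguments follows. It validates dateType against the four documented values before building the argument dict. The ISO date format (`YYYY-MM-DD`) and the helper name `filing_list_arguments` are assumptions; the document does not state the expected date format.

```python
from datetime import date

# Valid dateType values as documented in the table above.
VALID_DATE_TYPES = {"datefiled", "sec_received", "infilings_received", "reviewdate"}


def filing_list_arguments(companyid, date_start, date_end,
                          date_type="datefiled", form_types=None,
                          reviewed=None, limit=100):
    """Build the argument dict for a FilingList request."""
    if date_type not in VALID_DATE_TYPES:
        raise ValueError(f"unsupported dateType: {date_type!r}")
    if date_start > date_end:
        raise ValueError("dateStart must not be after dateEnd")
    args = {
        "companyid": companyid,
        "dateStart": date_start.isoformat(),  # assumed format
        "dateEnd": date_end.isoformat(),      # assumed format
        "dateType": date_type,
        "limit": limit,
    }
    if form_types:
        args["formTypes"] = list(form_types)
    if reviewed is not None:  # omit to get filings of either status
        args["reviewed"] = reviewed
    return args
```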

FilingContent¶

Get full or partial contents of a filing. Response includes Filings Object (shows as "filing" instead of "filings" for this call), TOC Object, and Sections Object.

Arguments¶

Argument	Type	Required	Description
iacc	int	Y	VerityData identifier for the filing
body	string	Y	Body output type (plain/html/diff/machine/plainmachine). More information on these output types is included below.
tlitem	string		Limit the response to a specific Top Level item.
includeexhibits	boolean		Include exhibit content in response. Defaults to False
changetype	[char]		Limit response to sections of given change type(s).
includedeletes	boolean		True to include deleted sections, False to exclude. Only for plain/html/plainmachine outputs.

MultiFilingContent¶

Get full or partial results from multiple filings, using a combination of arguments from the FilingList and FilingContent actions. The FilingList arguments select the filings to retrieve; the FilingContent arguments select the output formats. Response includes Filings Object, TOC Object, and Sections Object. Note: the toc and sections objects are located within the filings object for this call.

Arguments¶

Argument	Type	Required	Description
companyid	int or string		Identifier for the company. Accepted identifiers are: Ticker, CIK, CUSIP
multiiaccs	[int]		Array of IACCs to be included in the request. Maximum of 500.
dateStart	date or timestamp		Start date for the query. (Inclusive)
dateEnd	date or timestamp		End date for the query. (Inclusive)
dateType	string		The date/time to use for dateStart/dateEnd arguments. Valid values are "datefiled", "sec_received", "infilings_received", and "reviewdate". Defaults to "datefiled"
reviewed	boolean		If True, only return reviewed filings. If False, only return unreviewed filings. If null or not present, return filings of either status.
mcapLow	int		Lower bound for optional marketcap filter.
mcapHigh	int		Upper bound for optional marketcap filter.
formTypes	[string]		Array of formtypes to include.
body	string	Y	Body output type (plain/html/diff/machine/plainmachine). More information on these output types is included below.
tlitem	string		Limit the response to a specific Top Level item.
includeexhibits	boolean		Include exhibit content in response. Defaults to False
changetype	[char]		Limit response to sections of given change type(s).
includedeletes	boolean		True to include deleted sections, False to exclude. Only for plain/html/plainmachine outputs.
limit	int		Output limit. Defaults to 100. Ignored when using the multiiaccs argument.
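
Because multiiaccs accepts at most 500 IACCs per request, a longer list has to be split across several calls. The sketch below shows one way to batch them; the `action` key name and the helper name `multi_filing_payloads` are assumptions for illustration.

```python
MAX_IACCS_PER_REQUEST = 500  # documented maximum for multiiaccs


def multi_filing_payloads(iaccs, body="plain"):
    """Yield MultiFilingContent argument dicts with at most 500 IACCs each."""
    unique = list(dict.fromkeys(iaccs))  # drop duplicates, keep order
    for start in range(0, len(unique), MAX_IACCS_PER_REQUEST):
        yield {
            "action": "MultiFilingContent",  # assumed key name
            "multiiaccs": unique[start:start + MAX_IACCS_PER_REQUEST],
            "body": body,
        }
```

Each yielded dict would still need the user id and authorization token added before it is sent.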

Search¶

Retrieve results for curated and saved searches from the inFilings web platform via the API. Note: searches cannot be created through the API; it can only return results for searches that were previously created and saved on the web platform.

Arguments¶

Argument	Type	Required	Description
searchid	int	Y	ID of the search you'd like to use. This can be obtained from the URL when running the saved search on the web platform (e.g. infilings.com/search.php?searchid=XXXXXX)
companyid	string		Limit search results to a specific company identifier (ticker, CIK, or CUSIP)
limit	int		Maximum number of results (defaults to 75, max is 75)
page	int		For use when the search finds more results than the limit (75 by default). With the default limit of 75, a value of 2 returns results 76-150, 3 returns results 151-225, etc.
body	string		Body output type (plain/html/diff/machine/plainmachine).
timeframe	int		Override for the timeframe set in the saved search. The integer value is the number of days for the search to include.
sort	string		The sort order for the search results. Possible values are score (match score) or ftstamp (date filed). Default is ftstamp.
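
The relationship between page and limit can be sketched as simple arithmetic. The helper below assumes results are numbered from 1 and that a limit above the documented maximum of 75 is capped; the name `result_range` is hypothetical.

```python
MAX_LIMIT = 75  # documented default and maximum for the Search action


def result_range(page, limit=MAX_LIMIT):
    """Return the 1-based (first, last) result positions covered by a page."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    limit = max(1, min(limit, MAX_LIMIT))  # cap at the documented maximum
    first = (page - 1) * limit + 1
    return first, page * limit
```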

Response¶

Field	Type	Description
ticker	string
CUSIP	string
companyname	string
iacc	int	VerityData identifier for the filing.
cik	int	SEC CIK for the filer.
formtype	string	Form-type of the search results.
datefiled	date
filedtimestamp	timestamp	Date & time when the filing was received by the SEC.
mcap	float	Market-cap for the company.
boilerplate	boolean	'True' if the text within the Top Level Item, Item, or Section can be considered “boilerplate”, else 'False'. This distinction is made using a combination of automated textual analysis and manual analyst review.
title	string	The title for the section.
filingorder	integer	Numerical identifier indicating relative position within the filing. Smallest values are at the beginning of the filing while largest are at the end.
itemid	integer	Identifier for the parent item of the section. You can consider an item as a ‘container’ for one or more sections.
reviewed	boolean	True if the filing has been processed and structured by VerityData.
tlitem	string	The "Top-Level" item which contains the section. Top-level items are fundamentally based on the SEC's required items for a given form-type.
snippets	string	Snippet of text matching the search parameters

TableOfContents¶

Get the Table of Contents for a filing. Response includes TOC Object.

Arguments¶

Argument	Type	Required	Description
iacc	int	Y	VerityData identifier for the filing
tlitem	string		Limit the response to a specific Top Level item.

RiskFactorCounts¶

Get various Risk Factor statistics for a group of companies. The arguments for company selection are similar to the FilingList action's arguments. Response detailed below.

Arguments¶

Argument	Type	Description
companyid	int or string	Identifier for the company. Accepted identifiers are: Ticker, CIK, CUSIP
dateStart	date or timestamp	Start date for the query. (Inclusive)
dateEnd	date or timestamp	End date for the query. (Inclusive)
dateType	string	The date/time to use for dateStart/dateEnd arguments. Valid values are "datefiled", "sec_received", "infilings_received", and "reviewdate". Defaults to "datefiled"
reviewed	boolean	If True, only return reviewed filings. If False, only return unreviewed filings. If null or not present, return filings of either status.
mcapLow	int	Lower bound for optional marketcap filter.
mcapHigh	int	Upper bound for optional marketcap filter.
formTypes	string[]	Array of formtypes to include.
latest	boolean	Limit counts to the latest spot-approved 10-K / 10-Q for the company.
limit	int	Output limit. Defaults to 100.

Response¶

Field	Type	Description
estdatefiled	date
unusual	integer	Count of risk factors marked as "unusual" by our system. This attribute is determined using a combination of the filer's disclosure history as well as broader macro reporting trends.
new	integer	Count of new risk factors. See 'changetype' in the Sections Object table for more information on this change type.
deleted	integer	Count of deleted risk factors. See 'changetype' in the Sections Object table for more information on this change type.
risk_totalcount	integer	Total count of risk factors disclosed in the filing.
totalchange	integer	Total count of changed risk factors in the filing.
risk_unchangedcount	integer	Total count of unchanged risk factors disclosed in the filing.
bigchange	integer	Total count of risk factors with "big" changes. See 'changetype' in the Sections Object table for more information on this change type.
mediumchange	integer	Total count of risk factors with "medium" changes. See 'changetype' in the Sections Object table for more information on this change type.
smallchange	integer	Total count of risk factors with "small" changes. See 'changetype' in the Sections Object table for more information on this change type.
tinychange	integer	Total count of risk factors with "tiny" changes. See 'changetype' in the Sections Object table for more information on this change type.
wordcount	integer	Count of words within the Risk Factors Top Level Item (stop words excluded).
wordcountpct	double-precision	Risk Factor word count as a percentage of total words in all Top Level Items (excluding Exhibits).
jaccard	double-precision	Computed Jaccard score. See the Lazy Prices academic study for more information.
topadded	[JSON]	JSON output listing the most added words in the filing's Risk Factor disclosure along with their added count. Added words are determined by comparing the filing's Risk Factor disclosure to prior Risk Factor disclosures by the company (comparisons are made up to and including the prior 10-K). The word is identified by the 'w' header in the JSON.
topdeleted	[JSON]	JSON output listing the most deleted words in the filing's Risk Factor disclosure along with their delete count. The methodology and output is the same as the topadded field above.
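
For readers unfamiliar with the Jaccard score, it is the size of the intersection of two sets divided by the size of their union. The sketch below applies that standard definition to the word sets of two texts. The tokenization (lowercasing and whitespace splitting) is an assumption for illustration; this document does not describe how the `jaccard` field is tokenized or which filings are compared.

```python
def jaccard_similarity(text_a, text_b):
    """Jaccard similarity of the word sets of two texts: |A & B| / |A | B|."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty texts are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)
```

A score near 1 means the two disclosures share almost all of their vocabulary; a lower score indicates more added or removed words.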

Filing Structure and Top Level Items¶

10-K and 10-Q Top Level Items¶

The following table includes a list of Top Level Items in 10-K and 10-Q filings. Some Top Level Items are 10-K only, some are shared between 10-Ks and 10-Qs, and some are 10-Q only.

Top Level Item	Internal Name	10-K Location	10-Q Location	Boilerplate
Business Description	businessdesc	Item 1		
Risk Factors	risk	Item 1A	Part II – Item 1A	Yes
Unresolved Staff Comments	unresolved	Item 1B		Yes
Properties	properties	Item 2	Part II – Item 1	
Legal Proceedings	legal	Item 3		Yes
Mine Safety Disclosures	minesafety	Item 4	Part II – Item 4	Yes
Executive Officers	execofficers	Item 4A		
Stockholder Matters	stockholdermatters	Item 5		
Selected Financial Data	selectfinancial	Item 6		
Management Discussion and Analysis	mgmtdiscussion	Item 7	Part I – Item 2	
Quantitative Disclosures About Market Risk	quantdisclosures	Item 7A	Part I – Item 3	Yes
Critical Accounting Policies	criticalaccounting	Item 7CA (VerityData Specific)	Part I – Item 2CA (VerityData Specific)	Yes
Financial Statements	finstatements	Item 8	Part I – Item 1	
Footnotes	notestoconsolidated	Item 8N (VerityData Specific)	Part I – Item 1N (VerityData Specific)	
Accountant Changes and Disagreements	acctdisagreements	Item 9		Yes
Controls and Procedures	controlprocs	Item 9A	Part I – Item 4	Yes
Other Information	otherinfo	Item 9B	Part II – Item 5	Yes
Disclosure Regarding Foreign Jurisdictions that Prevent Inspections	inspectionprevention	Item 9C		Yes
Directors, Officers, and Corporate Governance	dirandexecs	Item 10		Yes
Executive Compensation	execcompensation	Item 11		Yes
Security Ownership of Related Stockholder	beneowners	Item 12		Yes
Certain Relationships and Related Transactions	relations	Item 13		Yes
Accountant Fees	acctfees	Item 14		Yes
Exhibits	finstatementschedules	Item 15	Part II – Item 6	
Summary	summary	Item 16		
Unregistered Sales	unregisteredsales		Part II – Item 2	Yes
Defaults Upon Senior Securities	seniordefaults		Part II – Item 3	Yes

IPO Filing Top Level Items¶

The following table includes a list of Top Level Items in IPO filings. We process the following types of IPO filings: DRS, DRS/A, S-1, S-1/A and 424B4. As you’ll see in the table, most of the Top Level Items are specific to VerityData.

Top Level Item	Internal Name	IPO Location	Additional Notes
Prospectus	prospectus	VerityData Specific
Risk Factors	risk	VerityData Specific	The company’s first 10-Q or 10-K filing is compared against the company’s 424B4 for the most accurate changes
Selected Financial and Operating Data	selectfinancial	VerityData Specific
Management Discussion and Analysis	mgmtdiscussion	VerityData Specific	The company’s first 10-K filing is compared against the company’s 424B4 for the most accurate changes
Critical Accounting Policies	criticalaccounting	VerityData Specific	The company’s first 10-K filing is compared against the company’s 424B4 for the most accurate changes
Quantitative Disclosures About Market Risk	quantdisclosures	VerityData Specific
Business	businessdesc	VerityData Specific	The company’s first 10-K filing is compared against the company’s 424B4 for the most accurate changes
Regulation / Legal Matters	legalmatters	VerityData Specific
Management	management	VerityData Specific
Financial Statements	finstatements	VerityData Specific	The company’s first 10-K filing is compared against the company’s 424B4 for the most accurate changes
Footnotes	notestoconsolidated	VerityData Specific	The company’s first 10-K filing is compared against the company’s 424B4 for the most accurate changes
Other Expenses of Issuance and Distribution	otherexpenses	Item 13
Indemnification of Directors and Officers	directors	Item 14
Recent Sales of Unregistered Securities	unregistered	Item 15
Exhibits	finstatementschedules	Item 16	Exhibits are not included in our standard file; contact us if you are interested in Exhibits
Undertakings	undertakings	Item 17
Other	other

Top Level Items - Additional Information¶

Overview¶

10-K and 10-Q filings generally follow a consistent outline as required by the SEC. We refer to the largest blocks as Top Level Items, which correspond to "Item" filing elements, e.g.

Item 1. Business Description
Item 1A. Risk Factors
Item 2. Properties
etc.

To build our Top Level Items structure, our proprietary software looks for markers that identify the beginning of each Top Level Item in the filing. See the 10-K and 10-Q Top Level Items table above for a complete list of Top Level Items available in 10-Ks and 10-Qs.

It’s important to note that filers don’t always follow the standard Top Level Item structure. Top Level Items are often placed in Exhibits or other sections. In rarer cases a company may incorporate content from an annual report "by reference," requiring us to parse that additional document. Our SEC Specialists check the Top Level filing consistency of every filing to ensure the correct overall structure.

VerityData Top Level Items¶

To allow clients easy access to important disclosures, we’ve created a handful of proprietary Top Level Items. These include Item 7CA. Critical Accounting Policies and Item 8N. Footnotes. We restructure filings for consistency: for example, if a filing has footnotes in the Exhibits, we move them into Item 8N. Footnotes.

IPO Filings¶

IPO filings follow a less consistent overall structure than 10-K and 10-Q filings. They have fewer explicitly defined “Item X. Item Name” breakouts within the filing. However, companies are generally consistent in disclosure names, like “Prospectus”, “Risk Factors” and so on.

When processing these IPO filings, we synthesize appropriate VerityData-specific Top Level Items. See the table above for a complete list of IPO Top Level Items.

Companies with Multiple Financials or Footnotes¶

For companies with subsidiaries or complex corporate structures that file multiple sets of Financial Statements and/or Footnotes, we create unique items for each set. We will include the name of the subsidiary within the title. Each set is considered independent of other sets, and will be matched with the corresponding set in any prior filings.

Detail Scopes¶

Below the Top Level Items, we structure the filings into "Sections," which consist of a title and the specific disclosure text. Sections are often collected into categories we call "Items." Items are collections of Sections, and may or may not have text of their own.

This structure is generated by our sophisticated software, which parses titles and callouts in the original filing, and can even identify structure by the way the filing's text is formatted.

For clarity, the software may re-organize the filing into more consistent Top Level Items, Items, and Sections, but it will maintain the vast majority of the original text, tables, and formatting. Tables, in particular, will be maintained in their original form.

Threading (Matching) between Filings¶

Most of the threading (matching) is done on like filings within the same Top Level Item. For example, Business Description in a 10-K is compared to Business Description in the prior 10-K. We call out important disclosures like Forward-Looking Statements to ensure accurate threading.

Inter-filing section matches do not need to be exact: the algorithms are able to adapt to wording changes, timeframe/period changes, and often even titles which have been renamed entirely. Human SEC specialists continually evaluate their performance.

Some sections are particularly sensitive to matching; these are described below.

With new data, there will not be existing data to match it to. When we determine a title was not disclosed in the prior filing, we classify it as a "New Disclosure."

Our technology intelligently uses prior filings to help it find matching data in new ones. Additionally, our team of SEC specialists review threads and edit the item-level and section-level matches when necessary. This allows us to capture changes in disclosures that may dramatically change locations or changes in titles from one filing to the next, and provides data to continually improve the software matches.

A special note as it relates to threading sections:
Our SEC Specialists attempt to structure specific disclosures across a company’s history, but sometimes:
* a specific section or disclosure dramatically changes structure between filings, or
* a company significantly changes the Top Level Item in which it makes a specific disclosure

In these cases, a similar disclosure may appear as New in one place and Deleted in another. Such events are rare, but we consider them a mis-categorization, and are constantly working to eliminate them.

Threading in "Risk Factor" and "Critical Accounting Policy"¶

The SEC has specific disclosure requirements for Risk Factor and Critical Accounting Policy updates. We analyze both 10-Ks and 10-Qs to accurately identify changes. For IPOs, we also include the company's final 424B4 IPO prospectus in the analysis.

For Risk Factors, we break out every individual risk factor by its title and description, since these often change title and disclosure order. Some companies also list only new or changed risk factors in their 10-Q disclosures. Because we have the 10-K, 10-Q, and IPO filings available, our software can account for these changes and provide the full list, calling out true "New" disclosures as well. As with other matches, risk factor matches are resilient in the face of wording changes in the titles or descriptions.

If you want to do detailed analysis on risk factors, ask us about our Risk Factor specific feed offerings. The Lazy Prices study highlighted changes in risk factors as an opportunity for alpha. With VerityData you’ll have access to more granular data and specific modeling opportunities.

Management Discussion & Analysis (MD&A) Threading:¶

MD&A sections typically include disclosures related to 3-, 6-, and 9-month results, depending on the fiscal period of the 10-Q filing. The software takes care to match like periods; this typically requires matching the same period from the previous fiscal year's filing, not the different-length period of the previous filing.

For example, a "6-month" results disclosure will often be compared to the "6-month" disclosure from three 10-Qs (one year) ago. This prevents comparing periods of different length, or mis-identifying the period as being a new disclosure. Note that this applies only to 10-Q MD&A disclosures; 10-Ks are always compared to the prior 10-K.
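
The like-period rule above can be illustrated with the fiscal fields from the Filings Object. This is only a sketch of the idea, not VerityData's matching software: it assumes filings are dicts with the documented field names and that documentfiscalperiodfocus holds values such as "Q2" (an assumption). The helper name `prior_like_period` is hypothetical.

```python
def prior_like_period(current, history):
    """Return the 10-Q from the previous fiscal year that covers the same
    fiscal period as `current`, or None if there is no such filing."""
    wanted_period = current.get("documentfiscalperiodfocus")
    wanted_year = current.get("documentfiscalyearfocus", 0) - 1
    for filing in history:
        if (filing.get("formtype") == "10-Q"
                and filing.get("documentfiscalperiodfocus") == wanted_period
                and filing.get("documentfiscalyearfocus") == wanted_year):
            return filing
    return None
```

For a second-quarter 10-Q, this skips the immediately preceding first-quarter 10-Q and selects the second-quarter 10-Q from one year earlier.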

IPO Filing Threading:¶

We thread IPO filings by starting with the company’s initial IPO-related filing. A typical IPO thread will consist of Draft Registration Statement (DRS), to DRS/A(s), to the S-1, to S-1/A(s), through the company’s final 424B4 Prospectus. Except as noted above for Risk Factors, we thread disclosures within the same IPO Top Level Items. When a company files its first 10-K, we will thread the Footnotes, Business Description, MD&A, and more back to the company’s 424B4 to allow for more context when analyzing the 10-K.

Change Scores¶

Once we've matched sections and items to their equivalents in previous filings, we generate a Change Score for each one. This reflects the amount of change between a disclosure and the same disclosure in a previous filing; the higher the Change Score, the greater the difference (and likely the greater the importance of the change).

The algorithms for calculating these differences are extremely sophisticated, built on the work of Eugene Myers. Punctuation differences; different phrasing; different forms of the same word; dates, ages, time periods, page numbers; items marked as "copy"; and other mundane or expected changes will not significantly affect the Change Score. Conversely, "small" changes with large meanings will affect it. These include things like opposite phrases (e.g. "will" to "will not"), changes in dollar or share values, the addition or removal of numeric tables, etc.

We do not incorporate changes in sentiment into the Change Score.
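
To make the idea of a change score concrete, here is a toy illustration only. It is not VerityData's algorithm: it uses Python's `difflib.SequenceMatcher` (which implements a Ratcliff/Obershelp-style matcher, not a Myers diff) and ignores only case and punctuation. The function names and the 0-to-1 scale are assumptions made for this example.

```python
import difflib
import re


def normalize(text):
    """Lowercase the text and keep only word, number, and dollar tokens,
    so punctuation and case differences are ignored."""
    return re.findall(r"[a-z0-9$]+", text.lower())


def toy_change_score(previous, current):
    """Return 0.0 for identical token sequences, approaching 1.0 as they diverge."""
    matcher = difflib.SequenceMatcher(None, normalize(previous), normalize(current))
    return 1.0 - matcher.ratio()
```

In this toy version, "The Company will pay." and "the company will pay" score 0.0, while changing "will" to "will not" yields a score above zero.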

Other Linguistics Analysis¶

Boilerplate Designation¶

A number of Top Level Items contain non-specific, general language, including short text like ‘None’ or ‘Not Applicable’. We classify these disclosures as "boilerplate." If a Top Level Item changes from boilerplate to actual content, the change type will appear as "New," to capture the fact that it is a new disclosure. Additionally, the change type for Subsequent Event disclosures in 10-K and 10-Q filings with new content will also appear as New.

IPO filings are handled slightly differently: Subsequent Event disclosures will display the respective change type, since companies often update their Subsequent Event disclosures as they update their IPO filings.

SEC Specialists review boilerplate settings to ensure accuracy. The boilerplate setting is especially helpful for Other Information and Subsequent Event disclosures, allowing you to focus analysis on disclosures that contain content. See the section on Top Level Items, above, for more information.

Tags¶

We ‘tag,’ or categorize, important disclosures such as Forward-Looking Statements and Critical Audit Matters. This allows for consistent threading across filings. Tags are created by utilizing Machine Learning and other proprietary software. SEC Specialists review tags as necessary. New tags are added over time.

Historical filing changes¶

Sometimes changes work both ways, and an older filing will be changed and re-approved in our system to make matching with modern filings more consistent. A change like this would result in re-approval(s) of the historical filing(s).

For example, in its most recent filing, a company decides to break a previously large section into smaller disclosure sections. In these cases, we may change the historical filings to more accurately thread (match) the disclosures.

As we add more Tags to the system, we may also need to modify the filing structure. If Tags are added, or if the filing structure is modified, this will require a re-approval of the filing.

Similarly, as more tags are added, these may be retroactively applied and result in historical filings being re-approved.

Additional Output Information and Examples¶

More on the Diff Output Body Type:¶

If you want to create a front-end application, the Diff output is likely the best output to use. It includes the ‘original’ HTML and the results of our proprietary software that identifies changes. At the end of the document, we’ve included details on the stylesheet and classes you’ll see in the JSON.

More on the Machine Output Body Type:¶

If you are looking to perform advanced textual analysis and machine learning, we suggest that you review the Machine output files first.

Their output is an array of objects. Each object has a Command (add / delete / copy) followed by a list of words.

In the example below:

word1 and word2 were added
word3 was unchanged
word4 and word5 were deleted

"diff": [
    {
        "command": "add",
        "words": [
            "word1",
            "word2"
        ]
    },
    {
        "command": "copy",
        "words": [
            "unchanged",
            "word3"
        ]
    },
    {
        "command": "delete",
        "words": [
            "word4",
            "word5"
        ]
    }
]

If the add command also has an “oldwords” key associated in the same diff, it is considered an ‘unimportant’ replacement – it is almost always a date or time period replacement. You would replace the “oldwords” with the current “words”.
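
The add / copy / delete commands, together with the “oldwords” rule above, are enough to rebuild both the prior and the current token sequences. The sketch below assumes the diff is a list of dicts shaped like the example above and that “oldwords” is a list of strings (its exact format is not shown in this document); the function name `rebuild_tokens` is hypothetical.

```python
def rebuild_tokens(diff):
    """Return (previous_tokens, current_tokens) from a Machine diff list."""
    previous, current = [], []
    for op in diff:
        command = op.get("command")
        words = op.get("words", [])
        if command == "copy":       # unchanged text appears in both versions
            previous.extend(words)
            current.extend(words)
        elif command == "add":      # new text; oldwords (if any) were replaced
            current.extend(words)
            previous.extend(op.get("oldwords", []))
        elif command == "delete":   # text present only in the prior filing
            previous.extend(words)
    return previous, current
```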

Here’s some sample code that counts the added and deleted words in a Machine output; the same loop could be used to collect added text to feed into, say, an RNN to learn what’s commonly added:

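# Python version of the counting loop. machine_diff_output stands in for the
# "diff" array of a Machine body; the sample mirrors the example above.
machine_diff_output = [
    {"command": "add", "words": ["word1", "word2"]},
    {"command": "copy", "words": ["unchanged", "word3"]},
    {"command": "delete", "words": ["word4", "word5"]},
]

added_words = 0
deleted_words = 0

for obj in machine_diff_output:
    if obj["command"] == "add":
        added_words += len(obj["words"])
    elif obj["command"] == "delete":
        deleted_words += len(obj["words"])

print(f"Added: {added_words} - Deleted: {deleted_words} Words")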

Diff Class Details and Stylesheet¶

See these descriptions for the more impactful classes within the Diff output:

Class	Description
.opAdd:	Denotes added text as a ‘block’ add – these are cases where viewing changes inline is too difficult to read. You can think of it the same as the .opChangeAdd for all intents and purposes.
.opChangeAdd:	Denotes added text ‘inline’ within the paragraph.
.opDelete:	Denotes deleted text as a ‘block’ delete, similar to the .opAdd as noted above.
.opChangeDelete:	Denotes deleted text ‘inline’ within the paragraph.
.opDeletedTable:	The class for deleted tables.
.opTooltip:	The class for tool-tips; we show tool-tips for date / year / quarter changes.
.opOpposite:	Denotes ‘opposite’ changes based on a proprietary dictionary, i.e. are -> were, gain -> loss.
.opPositiveGain:	We display number changes differently – this is for number changes that we calculate to be positive changes.
.opNegativeGain:	We display number changes differently – this is for number changes that we calculate to be negative changes.
.opSmallCopy:	We use this class for short amounts of text between adds and deletes; it is purely for readability, so these unchanged words don’t get lost.
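
As one way to use these classes outside a browser, the sketch below collects the text marked as added (`.opAdd` or `.opChangeAdd`) from a Diff body using Python's standard `html.parser`. It assumes reasonably well-formed HTML with matching open and close tags; the class name `AddedTextExtractor` and the helper `added_text` are hypothetical.

```python
from html.parser import HTMLParser

ADD_CLASSES = {"opAdd", "opChangeAdd"}
# Elements that never have a closing tag and must not be tracked on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class AddedTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self._stack = []   # one flag per open element: inside added text?
        self.chunks = []   # collected pieces of added text

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        inherited = bool(self._stack and self._stack[-1])
        self._stack.append(inherited or bool(classes & ADD_CLASSES))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if self._stack and self._stack[-1] and data.strip():
            self.chunks.append(data.strip())


def added_text(diff_html):
    """Return the text fragments that the Diff output marks as added."""
    parser = AddedTextExtractor()
    parser.feed(diff_html)
    parser.close()
    return parser.chunks
```

Swapping `ADD_CLASSES` for `{"opDelete", "opChangeDelete"}` would collect deleted text in the same way.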

Class Stylesheet:¶

    .opAdd
    {
        background-color: #EAFCD9 !important;
    }
    .opDelete
    {
        color: #757575 !important;
    }
    td.opDelete
    {
        text-decoration: line-through;
    }

    .opDelete .strikeBorder
    {
        border: solid 1px #eee;
    }

    .opDelete .strike div
    {
        text-decoration: line-through;
    }

    .opDelete table
    {
        color: #757575 !important;
    }

    .opDeletedTable
    {
        opacity: 0.75;
        border: solid 1px #c2c2c2;
        margin-top: 5px;
        padding-left: 5px;
        padding-right: 5px;
    }
    .opDeletedTable > div
    {
        padding: 5px;
    }

    .opDeleteNoBG
    {
        color: #999999 !important;
    }

    .opChangeDelete
    {
        color: #999999 !important;
    }
    .opChangeAdd
    {
        background-color: #EAFCD9 !important;
    }
    .opTooltip
    {
        background-color: #DEEDFF !important;
        border-top: solid 1px #cccccc;
        border-bottom: solid 1px #cccccc;
    }
    .opTooltip ins
    {
        text-decoration: none;
    }
    .opOpposite
    {
        background-color: #F0A3B4 !important;
        font-weight: bold;
    }

    .opPositiveGain
    {
        background-color: #EAFCD9 !important;
    }
    .opPositiveGain ins
    {
        text-decoration: none;
    }
    .opNegativeGain
    {
        background-color: #FFD1DE !important;
    }
    .opNegativeGain ins
    {
        text-decoration: none;
    }

    .opSmallCopy
    {
        background-color: #FFF4E3;
    }

Please contact datafeeds@verityplatform.com with any questions.