Using Feeds¶
Note on file names¶
Most feeds whose filenames include the date (e.g. "infilings_daily-20220801.txt") also generate a daily file without the date (e.g. "infilings_daily.txt"). The two files are otherwise identical; the one without the date is provided to simplify automated acquisition.
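As a minimal sketch of the naming pattern described above, the following builds both names for a given feed and date. The function name `feed_filenames` is hypothetical and only used for illustration; the feed name and date come from the example above.

```python
from datetime import date


def feed_filenames(feed: str, day: date) -> tuple:
    """Return (dated, undated) file names for a feed, following the pattern above."""
    dated = f"{feed}-{day:%Y%m%d}.txt"   # e.g. infilings_daily-20220801.txt
    undated = f"{feed}.txt"              # same content, stable name for automation
    return dated, undated


dated, undated = feed_filenames("infilings_daily", date(2022, 8, 1))
print(dated)    # infilings_daily-20220801.txt
print(undated)  # infilings_daily.txt
```

An automated job that only ever wants the latest file can fetch the undated name; a job that archives each day's file can use the dated one.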
Processing¶
Full/Partial history¶
- Available as <filing>.zip files.
- After you’ve developed your model, the history file can be regenerated daily on an ongoing basis. Each file will contain the most up-to-date complete history of our database. The new file is available for daily pickup after 7:30am.
- Each file supersedes the previously generated file, since the feed is considered snapshot data.
- Within the .zip, the CSV will be titled <filing>-YYYYMMDD, where YYYYMMDD is the quarter-end date of the current calendar quarter (e.g. 20190630).
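The quarter-end date in the CSV title can be computed rather than hard-coded. The sketch below assumes the title follows the same `<filing>-YYYYMMDD` pattern as the other files; `quarter_end` and `history_csv_title` are hypothetical helper names, and "myfiling" is a placeholder feed name.

```python
from datetime import date, timedelta


def quarter_end(day: date) -> date:
    """Return the last day of the calendar quarter containing `day`."""
    last_month = 3 * ((day.month - 1) // 3 + 1)  # 3, 6, 9 or 12
    if last_month == 12:
        return date(day.year, 12, 31)
    # First day of the month after the quarter, minus one day.
    return date(day.year, last_month + 1, 1) - timedelta(days=1)


def history_csv_title(filing: str, day: date) -> str:
    """Build the assumed CSV title for the history file generated on `day`."""
    return f"{filing}-{quarter_end(day):%Y%m%d}"


print(history_csv_title("myfiling", date(2019, 5, 15)))  # myfiling-20190630
```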
New Inserts¶
- Available at: <filing>-YYYYMMDD.txt
- The new insert file is available for daily pickup after 7:30am. The file includes all transactions that were disclosed the day prior.
- Each row in this daily insert file should be inserted as a new row into your database. These daily inserts add to the historical file that was initially provided.
Edits¶
- Available at: <filing>_edits-YYYYMMDD.txt
- The edit file is also available for daily pickup after 7:30am. This file includes transactions that were inserted before the prior day but were edited on the prior day. These are mainly transactions that are already in your database.
- To apply a transaction from the edit file, check its “Edit Action”:
  - If the Edit Action = “U”
    - Find the row with the matching txnid in your database.
    - Either replace the entire row with the row from the edit file, or mark the existing row as Inactive and add the edited row to your database as Active.
    - If no matching txnid is found in your database, consider this row a new insert. This is an extremely rare case.
  - If the Edit Action = “D”
    - Find the row with the matching txnid in your database.
    - Either delete the row entirely or mark the row as Inactive.
    - This row should no longer be considered in any models.
  - If the Edit Action = “I”
    - Check whether there is a matching txnid in your database.
    - If there is no matching txnid, consider this row an insert into the database.
    - If there is a matching txnid, the txnid was considered Inactive and should now be considered Active. This is an extremely rare case.
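The edit rules above can be sketched against a simple in-memory store. This is an illustration under stated assumptions, not a prescribed implementation: the "database" is a plain dict keyed by txnid, the column names `txnid` and `edit_action` may differ in your feed, the `active` flag is a hypothetical way of marking rows Inactive, and for an "I" row with an existing txnid the sketch stores the row from the edit file, which the rules above do not spell out.

```python
def apply_edit(db: dict, row: dict) -> None:
    """Apply one edit-file row to `db`, a dict mapping txnid -> row."""
    txnid = row["txnid"]
    action = row["edit_action"]
    if action == "U":
        # Replace the whole row. An unknown txnid is treated as a new insert.
        db[txnid] = {**row, "active": True}
    elif action == "D":
        # Soft delete: keep the row, but exclude it from any models.
        if txnid in db:
            db[txnid]["active"] = False
    elif action == "I":
        # New insert, or reactivation of a previously Inactive txnid.
        db[txnid] = {**row, "active": True}
    else:
        raise ValueError(f"unexpected edit action: {action!r}")


# Hypothetical rows for illustration only.
db = {"T1": {"txnid": "T1", "shares": "100", "active": True}}
apply_edit(db, {"txnid": "T1", "shares": "150", "edit_action": "U"})
apply_edit(db, {"txnid": "T2", "shares": "10", "edit_action": "I"})
apply_edit(db, {"txnid": "T2", "shares": "10", "edit_action": "D"})
active_rows = [r for r in db.values() if r["active"]]
```

Marking rows Inactive instead of deleting them keeps an audit trail and makes the rare reactivation case straightforward to handle.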
Parsing¶
Some information on parsing these files is provided here.
"CSV" Files¶
These files provide data formatted as a delimited text stream or file, often colloquially referred to as "Comma-Separated Value (CSV)" data.
Despite the name, the default delimiter for this data is the vertical bar "|" character, also called "pipe" in some technical fields. The vertical bar character was chosen because it is extremely uncommon in actual filing text. Each "row" of data appears on one line; fields which may contain line breaks or the delimiter character will have those removed to ensure clean parsing.
Note: Customers with specific parsing requirements can request a different delimiter character.
Otherwise, data is provided in a form as close as possible to that of the original filings: in particular, string data is not enclosed in quotation marks unless those marks are part of the actual filing text.
The first line of each feed is a header row, with headers delimited in the same way as data. This provides an in-data reference to what each field represents. Since some of the headers are abbreviated from complex strings and can be a little cryptic, a more descriptive name for each of them is shown in the documentation provided here. For example, the header "plansharespctofout" is used for "plan shares as a percentage of outstanding shares."
Any given row may have empty fields (several of them are optional, and sometimes even non-"optional" data was nevertheless omitted from the filing). Even if the field is empty, the delimiters are still present, so "data 1||data 3" indicates that whatever field is between "data 1" and "data 3" is not present.
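A short sketch of how empty fields look after splitting a line. The helper name `split_row` is hypothetical, and mapping empty strings to `None` is one possible convention, not a requirement of the feed.

```python
def split_row(line: str, delimiter: str = "|") -> list:
    """Split one feed line into fields, mapping empty fields to None."""
    # Strip only the line terminator, so trailing empty fields are preserved.
    raw = line.rstrip("\r\n").split(delimiter)
    return [value if value != "" else None for value in raw]


print(split_row("data 1||data 3\n"))  # ['data 1', None, 'data 3']
print(split_row("O2960||\n"))         # ['O2960', None, None]
```

Note that calling `strip("|")` or discarding empty strings would shift later fields out of their columns, so the field count should always match the header count.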
Data can be customized to adapt to customer needs; some customers may elect not to receive all fields in all feeds.
Human Viewing of "CSV" Feed Data¶
The feed data files, despite being ".txt" files, aren't particularly convenient for human reading. For human analysis, any modern spreadsheet—such as Microsoft Excel, Apple Numbers, or Google Sheets—can read and display delimiter-separated data in a much better formatted form:
ticker|cusip|companyname|sector|mcap|effectivedate|sourcedate|insertdate|formtype|plan_id|offertype|shareclass|other_class_title|updatetype|dollars|shares|remainingdollars|remainingshares|plansharespctofout|remainingsharespctofout|returnprior3mo|return3mo|commission|upto|useofproceeds|useofproceeds_notes|other_notes|sales_agents|lastupdate|rowid|split_edit|edit_action
AGLE|00773J103|Aeglea BioTherapeutics Inc.|15|92370490.0|20220520|20220520|20220520|424B5|2960|D|Common||I|60000000|39735099.3|60000000|39735099.3|64.96|0|-49.24||3|Y|General business related purposes/Raise working capital/General research and development/Commercial launch/Product development/Investments in interest bearing / debt securities|||JonesTrading Institutional Services|20220520|O2960||
AVGR|053734885|Avinger Inc|15|11302900.0|20220520|20220520|20220520|424B5|2961|D|Common||I|7000000|3517587.9|7000000|3517587.9|61.93|0|-58.48||3|N|General business related purposes/Raise working capital/Acquisitions / Business combinations/Repayment of debt/General research and development|Research and development of our Lumivascular platform products||H.C. Wainwright|20220520|O2961||
SMIT|806870200|Schmitt Industries|18|18746048.0|20220520|20220520|20220520|424B5|2964|D|Common||I|5000000|1020408.2|5000000|1020408.2|26.67|0|-5.21||3|N|General business related purposes/Raise working capital/Acquisitions / Business combinations/Investments in interest bearing / debt securities|||Roth Capital Partners|20220520|O2964||
STOK|86150R107|Stoke Therapeutics|15|534939678.0|20220520|20220520|20220520|S-3|1464|D|Common||R|||103688306|7596212.9|28.04|0|-28.04||3|Y|General business related purposes/Raise working capital/General research and development|||Stifel/Cantor Fitzgerald|20220520|R2120||
STOK|86150R107|Stoke Therapeutics|15|534939678.0|20220520|20220520|20220520|S-3|2963|D|Common||I|150000000|10989011|150000000|10989011|28.04|0|-28.04||3|Y|General business related purposes/Raise working capital/Acquisitions / Business combinations/Repayment of debt/General research and development/Investments in interest bearing / debt securities|||Cantor Fitzgerald|20220520|O2963||
VNRX|928661107|VolitionRX|15|137848251.0|20220520|20220520|20220520|424B5|2962|D|Common||I|25000000|9765625|25000000|9765625|18.14|0|-12.87||3|N|General business re
becomes a formatted table with one column per field (exact formatting will depend on your spreadsheet application).
You will typically need to provide the delimiter character ("|" vertical bar, unless a customer requests something else) as part of the conversion process, but it is otherwise automatic.
Parsing "CSV" Feed Data¶
More commonly, some automated process parses the data as part of some larger workflow. This documentation is primarily intended to support that process. The exact process will depend on the computer language, tools, and processes used to parse, but here are some general guidelines:
- Parse the header row. Each row of data will be formatted with its fields in the same order as the headers. While Verity changes the output format of these feeds very rarely, it does occasionally happen, and using header offsets rather than fixed indices will make your parsing robust in the face of added or deleted columns. For example, "market cap" will appear in the same position as the "mcap" header did in the header row.
- We strive to provide the cleanest data available. In the event of a failed parse (because of "bad" data), please preserve the input file and provide it to Verity for analysis. Simple corruption sometimes occurs during download, storage, or editing, but occasionally "unparsable" files can be a symptom of changes in the data we receive from the SEC or other sources.
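The header-based approach in the guidelines above can be sketched with Python's standard `csv` module. The sample below is a trimmed, illustrative subset of the columns shown earlier, and `read_feed` is a hypothetical helper name. `csv.QUOTE_NONE` is used on the assumption, stated above, that quotation marks in the data are literal filing text rather than field quoting.

```python
import csv
import io

# Trimmed sample: a few of the columns from the example rows shown earlier.
SAMPLE = (
    "ticker|cusip|companyname|mcap|formtype\n"
    "AGLE|00773J103|Aeglea BioTherapeutics Inc.|92370490.0|424B5\n"
    "AVGR|053734885|Avinger Inc|11302900.0|424B5\n"
)


def read_feed(stream, delimiter="|"):
    """Yield each data row as a dict keyed by the names in the header row."""
    reader = csv.DictReader(stream, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row


rows = list(read_feed(io.StringIO(SAMPLE)))
mcap = float(rows[0]["mcap"])   # look up by header name, not by column index
cusip = rows[1]["cusip"]        # kept as a string so the leading zero survives
```

When reading from disk, the `csv` documentation recommends opening the file with `newline=""`. Fields are looked up by name, so added or reordered columns do not break the parse.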
"Edits" feeds¶
Most feeds that describe non-permanent events have an "-edits" version available as well. This version adds a couple of columns (described with the main feed) that indicate when changes are made. Changes are indicated by a "U" (update) field, which can be used to completely replace the existing entry for that row, or a "D" (delete) field, indicating the row is no longer valid.
These edits happen for various reasons: plans are opened, closed, re-opened, or fundamentally changed in some way. Stock splits can also cause edits.
JSON Data¶
JSON stands for "JavaScript Object Notation," although its usage now extends far beyond its JavaScript origins. JSON data is less compact than CSV data, but easier to read because it associates the proper header/label with each individual data field. More importantly, JSON allows structured data: that is, objects with complex sub-elements, element arrays containing variable numbers of elements, and similar real-world structures.
The 10K and 10Q feed data is provided as zip-compressed directories of JSON files. JSON is used because the structure of this data is often highly variable (typically dependent on how individual companies interpret and fill out SEC filing forms).
Human Viewing of JSON Data¶
JSON data is surprisingly human-readable. Here's partial data for a 10K form (the full data is too long to show here).
{ "status": "ok", "filing": {
"ticker": "JKHY",
"cusip": 426281101,
"companyname": "Jack Henry & Associates",
"iacc": 39012422,
"cik": 779152,
"accessionnum": "0000779152-22-000076",
"formtype": "10-K",
"datefiled": "2022-08-25 00:00:00",
"filedtimestamp": "2022-08-25 16:59:58",
"received": "2022-08-25 16:59:58.985098",
"spotapproved": "true",
"spotapproveddate": "2022-08-25 17:31:46.889713",
"allfilerciks": [
779152
],
"documentperiodenddate": "2022-06-30 00:00:00",
"documentfiscalperiodfocus": "FY",
"documentfiscalyearfocus": 2022,
"seclink": "https:\/\/www.sec.gov\/Archives\/edgar\/data\/779152\/000077915222000076\/0000779152-22-000076-index.html"
} , "toc": {
"order": [
"businessdesc",
"risk",
"unresolved",
"properties",
"legal",
"minesafety",
"stockholdermatters",
"selectfinancial",
"mgmtdiscussion",
"criticalaccounting",
"quantdisclosures",
"finstatements",
"notestoconsolidated",
"acctdisagreements",
"controlprocs",
"otherinfo",
"inspectionprevention",
"dirandexecs",
"execcompensation",
"beneowners",
"relations",
"acctfees",
"finstatementschedules",
"summary"
],
"tlitem": {
"businessdesc": {
"itemtitle": "businessdesc",
"itemid": 127552122,
"level": 1,
"parent": null,
"tlitem": "businessdesc",
"content": [
{
"section": {
"itemid": 127552122,
"title": "businessdesc",
"sectionid": 630220655,
"intro": "true"
}
},
{
It's often easier to look at the raw data like the above, but if you'd like something a little "prettier," there are numerous tools and web sites out there that will present JSON data to you in various ways.
Note: Be careful about using these JSON formatting sites for data you consider proprietary.
Parsing JSON Data¶
On the automation side, things are similarly rosy. JSON is the de facto data formatting mechanism for digital data on modern systems. It is extremely likely that whatever language, operating system, or tool you are using to build your automation already supports JSON parsing, and if not, it's a virtual certainty that libraries are available. It should be unnecessary to write JSON parsing code yourself. If your system turns out to be that very rare exception, please contact Verity support; we often have other ways to provide the data to you.
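As an illustration, the sketch below reads JSON files straight out of a zip archive using only the Python standard library. It assumes the archive members end in ".json"; the helper name `iter_json_filings`, the member path, and the tiny in-memory archive are hypothetical stand-ins for a real download.

```python
import io
import json
import zipfile


def iter_json_filings(zip_source):
    """Yield (member name, parsed JSON) for each .json file in a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith(".json"):
                with archive.open(name) as member:
                    yield name, json.load(member)


# Build a small in-memory archive so the example runs without a download.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    payload = {"status": "ok", "filing": {"ticker": "JKHY", "formtype": "10-K"}}
    archive.writestr("example/filing.json", json.dumps(payload))

for name, data in iter_json_filings(buffer):
    print(name, data["filing"]["ticker"], data["filing"]["formtype"])
```

In practice `zip_source` would be the path of the downloaded .zip file; there is no need to extract the archive to disk first.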
JSON Data Definitions¶
The forms provided in JSON do not have detailed field definition tables in this document. This is because the data is highly variable (not all companies fill out their filings in the same way), and because structured, nested data like this is difficult to present in textual formats.
Verity strives to choose good tag and label names, but if any are unclear, contact Verity support for more information.
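Because there are no field definition tables for the JSON forms, one way to get oriented is to list the key paths that actually occur in a file. The sketch below is one possible approach; `key_paths` is a hypothetical helper, and the sample dict is a small excerpt modeled on the 10K example above.

```python
def key_paths(node, prefix=""):
    """Yield a dotted path for every leaf value in nested JSON data."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from key_paths(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for value in node:
            yield from key_paths(value, f"{prefix}[]")  # "[]" marks an array element
    else:
        yield prefix


sample = {
    "filing": {"ticker": "JKHY", "allfilerciks": [779152]},
    "toc": {"order": ["businessdesc", "risk"]},
}
paths = sorted(set(key_paths(sample)))
print(paths)  # ['filing.allfilerciks[]', 'filing.ticker', 'toc.order[]']
```

Running this over several filings and comparing the resulting path sets shows which fields are always present and which vary by company.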