Expression syntax guidelines

Expressions are the key components of Endpoint Data Discovery (EDD) rules and are used during an EDD scan to detect specific file content.

An expression may be a single word, such as "confidential", or it may include a combination of words, variables, special characters, and operators, such as:

(Confidential W/1 (Data OR Image OR Document))

This expression will detect any of the following phrases included in a file:

confidential data
Confidential image
confidential Document
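The proximity behaviour in this example can be sketched in a few lines of Python. This is only an illustration and not the EDD matching engine: the function name `within_n_words`, the tokenizing rule, and the treatment of distance as a difference in word positions are all assumptions made for the sketch.

```python
import re

def within_n_words(text, term, alternatives, n=1):
    """Return True if `term` occurs within `n` words of any alternative.

    Hypothetical helper: matching ignores letter case and looks both
    before and after the term. Words are runs of letters, digits,
    dashes, and underscores (an assumption for this sketch).
    """
    words = [w.lower() for w in re.findall(r"[A-Za-z0-9_-]+", text)]
    wanted = {a.lower() for a in alternatives}
    term = term.lower()
    for i, word in enumerate(words):
        if word != term:
            continue
        # Collect up to n words on each side of the matched term.
        window = words[max(0, i - n):i] + words[i + 1:i + 1 + n]
        if wanted.intersection(window):
            return True
    return False

alternatives = ["Data", "Image", "Document"]
print(within_n_words("This is confidential data.", "Confidential", alternatives))
print(within_n_words("confidential customer data", "Confidential", alternatives))
```

With `n=1` the first call finds a match because the two words are adjacent, while the second does not because another word separates them.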

To get started creating your own custom EDD rules, you first need to know how to use special characters and operators in your expressions, and understand the general guidelines and limitations that apply.

To protect confidential data, use the @Mask_After and the @Mask_Upto operators to redact personal information.
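As a rough illustration of what the two masking operators do to a matched string, the sketch below mimics their described effect in Python. The function names are hypothetical, and the sketch assumes that every character in the masked region, including separators such as dashes, is replaced with an asterisk; the product may treat separators differently.

```python
def mask_after(n, matched):
    """Keep the first n characters and mask everything after them."""
    return matched[:n] + "*" * max(0, len(matched) - n)

def mask_upto(n, matched):
    """Mask everything except the last n characters."""
    if n <= 0:
        return "*" * len(matched)
    return "*" * max(0, len(matched) - n) + matched[-n:]

print(mask_after(5, "785-65-4329"))      # keeps "785-6", masks the rest
print(mask_upto(4, "4111111111111747"))  # keeps only the last four digits
```

The input values here are made-up sample data, not real identifiers.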

Special characters

EDD rule expressions support the following special characters:

Character	Example	Description
?	???====	Match any single character [alphanumeric, - (dash), or _ (underscore)]. When the example text is used in an expression, a match is detected if any 3 characters are followed by any 4 digits, such as D4B5321.
=	===-==-====	Match any single digit [numeric only]. When the example text is used in an expression, a match is detected if any 9 digits are found in a file in the defined format, such as 123-45-6789.
*	in*	Match zero or more consecutive characters [alphanumeric, - (dash), and _ (underscore)]. When the example text is used in an expression, a match is detected if the characters "in" are followed by zero or more characters, such as "in", "interest", or "in543".
~	borrowing~	Match all words that contain the root form of a word. When the example text is used in an expression, a match is detected if a word contains the root "borrow", such as "borrow", "borrowed", "borrows", or "borrowing".
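One way to reason about the ?, =, and * characters is to translate them into an ordinary regular expression. The sketch below does this in Python for a single word; it is an approximation for experimenting with patterns, not the EDD implementation. The helper names are hypothetical, case-insensitive matching is an assumption, and the ~ (root form) character is not covered because it requires stemming.

```python
import re

# Character class taken from the table: alphanumeric, dash, underscore.
WORD_CHAR = "[A-Za-z0-9_-]"

def pattern_to_regex(pattern):
    """Translate ?, = and * into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(WORD_CHAR)        # any single character
        elif ch == "=":
            parts.append(r"\d")            # any single digit
        elif ch == "*":
            parts.append(WORD_CHAR + "*")  # zero or more characters
        else:
            parts.append(re.escape(ch))    # literal character
    return re.compile("".join(parts), re.IGNORECASE)

def matches_word(pattern, word):
    """Return True if the whole word fits the pattern."""
    return pattern_to_regex(pattern).fullmatch(word) is not None

print(matches_word("???====", "D4B5321"))
print(matches_word("===-==-====", "123-45-6789"))
print(matches_word("in*", "interest"))
```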

Operators

EDD rule expressions support the following operators:

Operator	Example	Description
//	/(^|\s)\d{3}-\d{3}-\d{4}(\s|[,.:;]|$)/	Regular expressions in Perl-Compatible Regular Expression (PCRE) format. When the example text is used in an expression, a match is detected if a phone number in the format of nnn-nnn-nnnn is found in a file. The number must either be at the start of a new line or after whitespace and end with whitespace, a comma, a period, a colon, a semicolon, or be at the end of a line.
( )	Patient W/1 (ID OR record OR number) W/1 ??=====	Sets precedence and defines grouping.
W/N	Interest W/2 (Rate OR Income)	Words or phrases occur within N words of each other. W/1 means the words are adjacent. When the example text is used in an expression, a match is detected if "Interest" is within 2 words of "Rate" or "Income" (either before or after). For example, the phrase "rate of interest" would generate a match.
W/0	(=== === === NOT W/0 (0== === === OR 8== === === OR 9== === ===))	Words or phrases occur within 0 words of each other. "W/0" essentially means "this word" and is often used in conjunction with the NOT operator. When the example text is used in an expression, a match is detected if a string contains 9 digits in the defined format, but not if it begins with 0, 8, or 9.
NOT	(=== == ==== NOT W/0 000 == ====)	Negates the results of an expression. You can use the NOT operator in conjunction with the W/N operator. When the example text is used in an expression, a match is detected if a string contains 9 digits in the defined format, but not if it begins with 000.
AND	Account AND (Number OR Statement)	To generate a match, all values must be matched. When the example text is used in an expression, a match is detected if either of the phrases "Account Number" or "Account Statement" is found in a file.
OR	(Number OR Nr OR Num) W/1 =======	To generate a match, one value must be matched. When the example text is used in an expression, if the phrase "Nr 5546372" is found in a file, a match is detected.
@Luhn	@luhn (==== ==== ==== ====)	To generate a match, the string must contain a valid Luhn algorithm check digit. The Luhn algorithm, also known as the modulus 10 or mod 10 algorithm, is a simple checksum formula used to validate identification numbers, such as credit card numbers. This operator is used in the Credit Card template. When the example text is used in an expression, a match is detected if a 16 digit number, in the defined format, contains a valid check digit.
@Date_Any_Alpha (<locale parameter>)	@Date_Any_Alpha (en_US)	To generate a match, the string must contain a valid alphanumeric date supported by the locale parameter. Alphanumeric dates use month names, such as September 7, 2018, or month abbreviations, such as 15 Aug 2018. When the example text is used in an expression, a match is detected if an alphanumeric date is found in a file in United States locale format.
@Date_Any_Numeric (all)	@Date_Any_Numeric (all)	To generate a match, the string must contain a valid numeric date in any supported numeric date format. The following separators are supported: dash (-), forward slash (/), period (.), and space. Use of the "all" locale parameter may result in a large number of false positives (results on an EDD-related report or page in which a match is detected in a file but, upon further investigation, you do not consider the matched content to be at-risk data). If the locale is known, use its applicable parameter instead.
@Date_Any_Numeric (<locale parameter>)	@Date_Any_Numeric (en_US)	To generate a match, the string must contain a valid numeric date in the format supported by the locale parameter. When the example text is used in an expression, a match is detected if a date in one of the following formats is found in a file: MM/DD/YYYY, M/D/YY, or any combination of the supported day, month, and year values.
@Date_Specific_Numeric (all; YYYY; MM; DD)	@Date_Specific_Numeric (all; 2019; 01; 15)	To generate a match, the string must contain the date set in the operator in any supported numeric date format. When the example text is used in an expression, a match is detected if the date January 15, 2019 is found in a file in any supported numeric date format.
@Date_Specific_Numeric (<locale parameter>; YYYY; MM; DD)	@Date_Specific_Numeric (en_CA; 2012; 12; 31)	To generate a match, the string must contain a valid numeric date supported by the locale parameter and matching the date set within the operator. When the example text is used in an expression, a match is detected if the date December 31, 2012 is found in a file in Canadian locale format.
@Mask_After( n; <expression>)	@mask_after( 5; (===-==-====))	Mask all characters after the <n>th character in a matched "word". This operator is used in many of the templates. When the example text is used in an expression, all characters after the 5th character are masked (replaced with asterisks), such as 785-6******.
@Mask_Upto( n; <expression>)	@mask_upto( 4; (==== ==== ==== ====))	Mask all characters in a matched "word", except the last <n> characters. When the example text is used in an expression, all characters up to the last four characters are masked (replaced with asterisks), such as **** **** **** 1747.
@FILENAME("<expression>")	@FILENAME("salary*.xlsx")	To generate a match, the string must be in the base name of the file, not part of the file path. The expression must be in quotation marks (""), cannot contain other operators, and can only contain the ? and * special characters. If ? or * is used with the @FILENAME operator, the expression must also contain additional search strings. Unlike other operators, matches are not limited to the supported file types: if a file with a matching name is found in any scanned document folder, the match is reported regardless of the file type. Rule matches cannot be tested in the Test Rule section. When the example text is used in an expression, a match is detected if a file name starts with salary and ends with .xlsx.
@SHA256("<hex-match>")	@SHA256("f20b5ac95bfe13ade7afb380982252fe6c0cf0bf72f4032f775498e075ce39ce")	To generate a match, the hex-match (a 64-character hexadecimal string representing the SHA256 hash) must match the hash of the file's content. The expression must be in quotation marks ("") and cannot contain other operators. Unlike other operators, matches are not limited to the supported file types: if a file with a matching SHA256 content hash is found in any scanned document folder, the match is reported regardless of the file type. Rule matches cannot be tested in the Test Rule section.
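The check digit test that the @Luhn operator is described as performing can be reproduced with the standard Luhn (mod 10) algorithm. The Python sketch below shows that algorithm; the function name is hypothetical, and ignoring spaces and other separators is an assumption made so that formatted numbers can be passed in directly.

```python
def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn (mod 10) check."""
    digits = [int(c) for c in number if "0" <= c <= "9"]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0

# A widely used test card number, not a real account.
print(luhn_valid("4111 1111 1111 1111"))
print(luhn_valid("4111 1111 1111 1112"))
```

Checking a candidate list with a function like this before writing a rule can help confirm that sample data will actually pass the check digit test.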

Due to the role of NOT, AND, and OR as operators, you can't use them as "words" in an expression.
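The regular expression shown for the // operator in the table above can be tried out before it is placed in a rule. The sketch below uses Python's `re` module, which is not PCRE but supports the constructs this pattern uses. The `re.MULTILINE` flag is an assumption, chosen so that ^ and $ match at line boundaries as the description suggests.

```python
import re

# Pattern body from the // operator example, without the surrounding slashes.
PHONE = re.compile(r"(^|\s)\d{3}-\d{3}-\d{4}(\s|[,.:;]|$)", re.MULTILINE)

samples = [
    "Call 555-123-4567, thanks",   # after whitespace, followed by a comma
    "555-123-4567",                # alone on a line
    "x555-123-4567",               # preceded by a letter
    "555-123-45678",               # too many trailing digits
]
for sample in samples:
    print(sample, "->", bool(PHONE.search(sample)))
```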

More expression examples:

Expression	Definition	Examples of detected content
Asthma	Detects the word "Asthma" regardless of letter case.	Asthma, asthma, ASTHMA
(West Nile W/3 (fever OR virus))	Detects the words "west nile" within 3 words of the words "fever" or "virus".	"fever caused by west nile"
(Student OR Student ID OR Student Identification Number OR Student Identification) W/1 ???====*	Detects any of the words [Student, Student ID, Student Identification Number, Student Identification] within one word of an identifier in the format [(3 characters) + (4 digits) + (any non-whitespace characters)].	Student HGF54793TY8A, Student ID xyz45976
NOT @SHA256("f20b5ac95bfe13ade7afb380982252fe6c0cf0bf72f4032f775498e075ce39ce") AND SSN ===-==-====	Detects the word SSN followed by a common US Social Security Number in the format 3 digits-2 digits-4 digits, where the SHA256 hash of the file's content is not f20b5ac95bfe13ade7afb380982252fe6c0cf0bf72f4032f775498e075ce39ce.	"SSN 000-11-1111" in a file whose SHA256 content hash is not f20b5ac95bfe13ade7afb380982252fe6c0cf0bf72f4032f775498e075ce39ce
NOT @FILENAME("salary*.xlsx") AND SSN ===-==-====	Detects the word SSN followed by a common nine-digit US Social Security Number in the format 3 digits-2 digits-4 digits, where the file name doesn't match salary*.xlsx.	"SSN 000-11-1111" in a file with the file name "benefits.xlsx"
/(^|\s)[a-z]\d[a-z]-\d[a-z]\d(\s|[,.:;]|$)/	Detects a Canadian postal code in the format A1A-1A1 (case insensitive). The postal code must either be at the start of a new line or after whitespace and end with whitespace, a comma, a period, a colon, a semicolon, or be at the end of a line.	"V1J-4M6"
@filename(".qb")	Detects any file whose extension starts with .qb, such as .qbw or .qbo. QuickBooks file extensions frequently start with .qb.	ABCCompany.qbo
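When writing rules that use @SHA256 or @FILENAME, it can help to compute a file's content hash yourself and to check which base names a wildcard pattern would cover. The Python sketch below shows one way to do both. The helper names are hypothetical, the case-insensitive comparison is an assumption, and `fnmatch` only approximates the ? and * characters (it lets them match any character and also treats square brackets specially).

```python
import hashlib
from fnmatch import fnmatch
from pathlib import PurePosixPath

def sha256_of_file(path, chunk_size=65536):
    """Return the 64-character hexadecimal SHA256 hash of a file's content."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def base_name_matches(path, pattern):
    """Compare only the base name of `path` against a wildcard pattern."""
    name = PurePosixPath(path.replace("\\", "/")).name
    return fnmatch(name.lower(), pattern.lower())

print(base_name_matches("/hr/salary_2019.xlsx", "salary*.xlsx"))
print(base_name_matches("/salary/benefits.xlsx", "salary*.xlsx"))
```

The second call returns False because only the base name is compared, not the folder path, which mirrors the @FILENAME description.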

Country-specific operators

EDD rule expressions support the following country-specific operators:

Operator	Example	Description
@JPIDNumber (Japan)	@jpidnumber (====-====-====)	This operator detects valid Japanese My Numbers. It is used in the My Number (Japan) template. When the example text is used in an expression, a match is detected if a 12 digit number, in the defined format, is a valid Japanese My Number.
@US_MBI (United States)	@US_MBI (=??=-??=-??== OR =??=??=??==)	This operator detects valid US Medicare Beneficiary Identifiers. It is included in the General ID Samples template. When the example text is used in an expression, a match is detected if an 11 character alphanumeric value, in the defined format, is a valid US Medicare Beneficiary Identifier.
Countries in the European Economic Area (EEA). The EEA includes all European Union (EU) countries and also Iceland, Liechtenstein, and Norway. It defines the countries that participate in the European Single Market.
The operators in this section are used by the predefined GDPR Personal Data rule to find personal identifiers. They are also included in the GDPR Personal Data template.
@AT_TIN (Austria)	@AT_TIN (== ======= OR === ======= OR ========== )	This operator detects valid Austrian Tax Identification Numbers. When the example text is used in an expression, a match is detected if a 9 or 10 digit number, in the defined format, is a valid Austrian Tax Identification Number.
@BE_NN (Belgium)	@BE_NN (== == ==-=== == OR ===========)	This operator detects valid Belgian National Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Belgian National Number.
@BG_UCN (Bulgaria)	@BG_UCN (==========)	This operator detects valid Bulgarian National IDs. When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid Bulgarian National ID.
@CY_TIC (Cyprus)	@CY_TIC (========?)	This operator detects valid Cypriot Tax ID Codes. When the example text is used in an expression, a match is detected if a 9 character alphanumeric value, in the defined format, is a valid Cypriot Tax ID Code.
@CZ_SK_BN (Czech Republic and Slovakia)	@CZ_SK_BN (====== ==== OR ====== === OR ========== OR =========)	This operator detects valid Czech or Slovakian Birth Numbers. When the example text is used in an expression, a match is detected if a 9 or 10 digit number, in the defined format, is a valid Czech or Slovakian Birth Number.
@DE_IDNR (Germany)	@DE_IDNR (=========== OR == === === ===)	This operator detects valid German Taxpayer IDs. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid German Taxpayer ID.
@DE_SSN (Germany)	@DE_SSN (========?=== OR == ====== ? ===)	This operator detects valid German Social Security Numbers. When the example text is used in an expression, a match is detected if a 12 character alphanumeric value, in the defined format, is a valid German Social Security Number.
@DK_CPR (Denmark)	@DK_CPR (======-==== OR ==========)	This operator detects valid Danish Civil Registration System numbers. When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid Danish Civil Registration System number.
@EE_IK (Estonia)	@EE_IK (===========)	This operator detects valid Estonian Personal ID Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Estonian Personal ID Number.
@ES_DNI (Spain)	@ES_DNI (========-?)	This operator detects valid Spanish DNI (Documento Nacional de Identidad) numbers. When the example text is used in an expression, a match is detected if a 9 character alphanumeric value, in the defined format, is a valid Spanish Documento Nacional de Identidad.
@FI_HETU (Finland)	@FI_HETU (======-===? OR ======A===?)	This operator detects valid Finnish Personal ID Numbers. When the example text is used in an expression, a match is detected if an 11 character alphanumeric value, in the defined format, is a valid Finnish Personal ID Number.
@FR_NIR (France)	@FR_NIR (======?======== OR = == == =? === === ==)	This operator detects valid French Social Security Numbers. When the example text is used in an expression, a match is detected if a 15 character alphanumeric value, in the defined format, is a valid French Social Security Number.
@GR_AFM (Greece)	@GR_AFM (=========)	This operator detects valid Greek Tax Identifier numbers. When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Greek Tax Identifier number.
@GR_AMKA (Greece)	@GR_AMKA (===========)	This operator detects valid Greek Social Security Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Greek Social Security Number.
@GR_TAUTOTITA (Greece)	@GR_TAUTOTITA (?-====== OR ? ======)	This operator detects valid Greek Personal Identification Numbers. When the example text is used in an expression, a match is detected if a 7 character alphanumeric value, in the defined format, is a valid Greek Personal Identification Number.
@HR_MBG (Croatia)	@HR_MBG (=============)	This operator detects valid Croatian Matični broj građana (MBG). When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Croatian MBG.
@HR_OIB (Croatia)	@HR_OIB (===========)	This operator detects valid Croatian Osobni identifikacijski broj (OIB). When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Croatian OIB.
@HU_PID (Hungary)	@HU_PID (======??)	This operator detects valid Hungarian Personal IDs. When the example text is used in an expression, a match is detected if an 8 character alphanumeric value, in the defined format, is a valid Hungarian Personal ID.
@HU_PIN (Hungary)	@HU_PIN (===========)	This operator detects valid Hungarian Personal Identification Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Hungarian Personal Identification Number.
@HU_TAJ (Hungary)	@HU_TAJ (=== === === OR ===-===-=== OR =========)	This operator detects valid Hungarian Social Security Identification Numbers (TAJ). When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Hungarian Social Security Identification Number.
@IE_PIN (Ireland)	@IE_PIN (=======? OR =======??)	This operator detects valid Irish Public Service Numbers. When the example text is used in an expression, a match is detected if an 8 or 9 character alphanumeric value, in the defined format, is a valid Irish Public Service Number.
@IS_KT (Iceland)	@IS_KT (========== OR ======-====)	This operator detects valid Icelandic Kennitala. When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid Icelandic Kennitala.
@IT_CF (Italy)	@IT_CF (??????==?==?===?)	This operator detects valid Italian Tax Code IDs. When the example text is used in an expression, a match is detected if a 16 character alphanumeric value, in the defined format, is a valid Italian Tax Code ID.
@LI_PIN (Liechtenstein)	@LI_PIN (============)	This operator detects valid Liechtensteiner Identity Cards. When the example text is used in an expression, a match is detected if a 12 digit number, in the defined format, is a valid Liechtensteiner Identity Card.
@LT_AK (Lithuania)	@LT_AK (===========)	This operator detects valid Lithuanian Personal Codes (Asmens Kodas). When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Lithuanian Personal Code.
@LU_TIN (Luxembourg)	@LU_TIN (==== == == === == OR =============)	This operator detects valid Luxembourgish Personal ID Numbers. When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Luxembourgish Personal ID Number.
@LV_PK (Latvia)	@LV_PK (======-=====)	This operator detects valid Latvian Code numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Latvian Code number.
@MT_KTI (Malta)	@MT_KTI (=======?)	This operator detects valid Maltese Identity Card Numbers. When the example text is used in an expression, a match is detected if an 8 character alphanumeric value, in the defined format, is a valid Maltese Identity Card Number.
@NL_BSN (Netherlands)	@NL_BSN (========= OR ==== == ===)	This operator detects valid Dutch Citizen Service Numbers. When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Dutch Citizen Service Number.
@NO_FNR (Norway)	@NO_FNR (===========)	This operator detects valid Norwegian Personal ID Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Norwegian Personal ID Number.
@PL_PESEL (Poland)	@PL_PESEL (===========)	This operator detects valid Polish Personal Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Polish Personal Number.
@PT_NIC (Portugal)	@PT_NIC (=========??= OR ========-=??= OR ======== =??=)	This operator detects valid Portuguese Civil Identification Numbers. When the example text is used in an expression, a match is detected if a 12 character alphanumeric value, in the defined format, is a valid Portuguese Civil Identification Number.
@PT_NIF (Portugal)	@PT_NIF (=== === === OR ===-===-=== OR =========)	This operator detects valid Portuguese Tax Identification Numbers. When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Portuguese Tax Identification Number.
@PT_NISS (Portugal)	@PT_NISS (===========)	This operator detects valid Portuguese Social Security Numbers. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Portuguese Social Security Number.
@RO_CNP (Romania)	@RO_CNP (=============)	This operator detects valid Romanian Personal Numerical Codes. When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Romanian Personal Numerical Code.
@SE_PIN (Sweden)	@SE_PIN (======-==== OR ======== ==== OR ============)	This operator detects valid Swedish Personal ID Numbers. When the example text is used in an expression, a match is detected if a 10 or 12 digit number, in the defined format, is a valid Swedish Personal ID Number.
@SI_EMSO (Slovenia)	@SI_EMSO (=============)	This operator detects valid Slovenian Master Citizen Numbers. When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Slovenian Master Citizen Number.
@SK_COP (Slovakia)	@SK_COP (??====== OR ?? ======)	This operator detects valid Slovakian Citizen ID Numbers. When the example text is used in an expression, a match is detected if an 8 character alphanumeric value, in the defined format, is a valid Slovakian Citizen ID Number.
@UK_NHS	@UK_NHS (=== === ====)	This operator detects valid UK National Health Service numbers. When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid National Health Service number.
@UK_NINO	@UK_NINO (??======?)	This operator detects valid UK National Insurance Numbers. When the example text is used in an expression, a match is detected if a 9 character alphanumeric value, in the defined format, is a valid UK National Insurance Number.
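The operators above go beyond format matching: each one also decides whether a candidate is "valid". How EDD performs these checks is not described here, but many national identifiers carry a check digit. As one example of what such a check can look like, the Python sketch below applies the modulus 11 check used for UK NHS numbers. The function name is hypothetical, and the sketch is not taken from the product.

```python
def nhs_number_valid(value):
    """Return True if `value` holds 10 digits that pass the NHS mod 11 check."""
    digits = [int(c) for c in value if "0" <= c <= "9"]
    if len(digits) != 10:
        return False
    # Weight the first nine digits 10, 9, ..., 2 and sum the products.
    total = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check value of 10 means the number cannot be valid.
    return check != 10 and check == digits[9]

# A commonly published example number, not a real patient identifier.
print(nhs_number_valid("943 476 5919"))
print(nhs_number_valid("943 476 5918"))
```

A candidate that merely fits the === === ==== format but fails this arithmetic would not count as a valid NHS number under this check.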

Operator

Example

Description

@JPIDNumber

(Japan)

@jpidnumber (====-====-====)

This operator detects valid Japanese My Numbers. It is used in the My Number (Japan) template.

When the example text is used in an expression, a match is detected if a 12 digit number, in the defined format, is a valid Japanese My Number.

@US_MBI

(United States)

US_MBI (=??=-??=-??== OR =??=??=??==)

This operator detects valid US Medicare Beneficiary Identifiers. It is included in the General ID Samples template. When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid US Medicare Beneficiary Identifier.

Countries in the European Economic Area (EEA) The EEA includes all European Union (EU) countries and also Iceland, Liechtenstein, and Norway. It defines the countries that participate in the European Single Market.

The operators in this section are used by the predefined GDPR Personal Data rule to find personal identifiers. They are also included in the GDPR Personal Data template.

@AT_TIN

(Austria)

@AT_TIN (== ======= OR === ======= OR ========== )

This operator detects valid Austrian Tax Identification Numbers.

When the example text is used in an expression, a match is detected if a 9 or 10 digit number, in the defined format, is a valid Austrian Tax Identification Number.

@BE_NN

(Belgium)

@BE_NN (== == ==-=== == OR ===========)

This operator detects valid Belgian National Numbers.

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Belgian National Number.

@BG_UCN

(Bulgaria)

@BG_UCN (==========)

This operator detects valid Bulgarian National IDs.

When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid Bulgarian National ID.

@CY_TIC

(Cyprus)

@CY_TIC (========?)

This operator detects valid Cypriot Tax ID Codes.

When the example text is used in an expression, a match is detected if a 9 character alphanumeric value, in the defined format, is a valid Cypriot Tax ID Code.

@CZ_SK_BN

(Czech Republic and Slovakia)

@CZ_SK_BN (====== ==== OR ====== === OR ========== OR =========)

This operator detects valid Czech or Slovakian Birth Numbers.

When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid Czech or Slovakian National Number.

@DE_IDNR

(Germany)

@DE_IDNR (=========== OR == === === ===)

This operator detects valid German Taxpayer IDs.

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid German Taxpayer ID.

@DE_SSN

(Germany)

@DE_SSN (========?=== OR == ====== ? ===)

This operator detects valid German Social Security Numbers.

When the example text is used in an expression, a match is detected if a 12 character alphanumeric value, in the defined format, is a valid German Social Security Number.

@DK_CPR

(Denmark)

@DK_CPR (======-==== OR ==========)

This operator detects valid Danish Civil Registration System numbers.

When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid Danish Civil Registration System number.

@EE_IK

(Estonia)

@EE_IK (===========)

This operator detects valid Estonian Personal ID Numbers.

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Estonian Personal ID Number.

@ES_DNI

(Spain)

@ES_DNI (========-?)

This operator detects valid Spanish DNI (Documento Nacional de Identidad) numbers.

When the example text is used in an expression, a match is detected if a 9 character alphanumeric value, in the defined format, is a valid Spanish Documento Nacional de Identidad.

@FI_HETU

(Finland)

@FI_HETU (======-===? OR ======A===?)

This operator detects valid Finnish Personal ID Numbers.

When the example text is used in an expression, a match is detected if an 11 character alphanumeric value, in the defined format, is a valid Finnish Personal ID Number.

@FR_NIR

(France)

@FR_NIR (======?======== OR = == == =? === === ==)

This operator detects valid French Social Security Numbers.

When the example text is used in an expression, a match is detected if a 15 character alphanumeric value, in the defined format, is a valid French Social Security Number.

@GR_AFM

(Greece)

@GR_AFM (=========)

This operator detects valid Greek Tax Identifier numbers.

When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Greek Tax Identifier number.

@GR_AMKA

(Greece)

@GR_AMKA (===========)

This operator detects valid Greek Social Security Numbers.

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Greek Social Security Number.

@GR_TAUTOTITA

(Greece)

@GR_TAUTOTITA (?-====== OR ? ======)

This operator detects valid Greek Personal Identification Numbers.

When the example text is used in an expression, a match is detected if a 7 character alphanumeric value, in the defined format, is a valid Greek Personal Identification Number.

@HR_MBG

(Croatia)

@HR_MBG (===========)

This operator detects valid Croatian Matični broj građana (MBG).

When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Croatian MBG.

@HR_OIB

(Croatia)

@HR_OIB (===========)

This operator detects valid Croatian Osobni identifikacijski broj (OIB).

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Croatian OIB.

@HU_PID

(Hungary)

@HU_PID (======??)

This operator detects valid Hungarian Personal IDs.

When the example text is used in an expression, a match is detected if a 8 character alphanumeric value, in the defined format, is a valid Hungarian Personal ID.

@HU_PIN

(Hungary)

@HU_PIN (===========)

This operator detects valid Hungarian Personal Identification Numbers.

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Hungarian Personal Identification Number.

@HU_TAJ

(Hungary)

@HU_TAJ (=== === === OR ===-===-=== OR =========)

This operator detects valid Hungarian Social Security Identification Numbers (TAJ).

When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Hungarian Social Security Identification Number.

@IE_PIN

(Ireland)

@IE_PIN (=======? OR =======??)

This operator detects valid Irish Public Service Numbers.

When the example text is used in an expression, a match is detected if an 8 or 9 character alphanumeric value, in the defined format, is a valid Irish Public Service Number.

@IS_KT

(Iceland)

@IS_KT (========== OR ======-====)

This operator detects valid Icelandic Kennitala.

When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid Icelandic Kennitala.

@IT_CF

(Italy)

@IT_CF (??????==?==?===?)

This operator detects valid Italian Tax Code IDs.

When the example text is used in an expression, a match is detected if a 16 character alphanumeric value, in the defined format, is a valid Italian Tax Code ID.

@LI_PIN

(Liechtenstein)

@LI_PIN (============)

This operator detects valid Liechtensteiner Identity Cards.

When the example text is used in an expression, a match is detected if a 12 digit number, in the defined format, is a valid Liechtensteiner Identity Card.

@LT_AK

(Lithuania)

@LT_AK (===========)

This operator detects valid Lithuanian Personal Codes (Asmens Kodas).

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Lithuanian Personal Code.

@LU_TIN

(Luxembourg)

@LU_TIN (==== == == === == OR =============)

This operator detects valid Luxembourgish Personal ID Numbers.

When the example text is used in an expression, a match is detected if an 13 digit number, in the defined format, is a valid Luxembourgish Personal ID Number.

@LV_PK

(Latvia)

@LV_PK (======-=====)

This operator detects valid Latvian Code numbers.

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Latvian Code number.

@MT_KTI

(Malta)

@MT_KTI (=======?)

This operator detects valid Maltese Identity Card Numbers.

When the example text is used in an expression, a match is detected if an 8 character alphanumeric value, in the defined format, is a valid Maltese Identity Card Number.

@NL_BSN

(Netherlands)

@NL_BSN (========= OR ==== == ===)

This operator detects valid Dutch Citizen Service Numbers.

When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Dutch Citizen Service Number.

@NO_FNR

(Norway)

@NO_FNR (===========)

This operator detects valid Norwegian Personal ID Numbers.

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Norwegian Personal ID Number.

@PL_PESEL

(Poland)

@PL_PESEL (===========)

This operator detects valid Polish Personal Numbers.

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Polish Personal Number.

@PT_NIC

(Portugal)

@PT_NIC (=========??= OR ========-=??= OR ======== =??=)

This operator detects valid Portuguese Civil Identification Numbers.

When the example text is used in an expression, a match is detected if a 12 character alphanumeric value, in the defined format, is a valid Portuguese Civil Identification Number.

@PT_NIF

(Portugal)

@PT_NIF (=== === === OR ===-===-=== OR =========)

This operator detects valid Portuguese Tax Identification Numbers.

When the example text is used in an expression, a match is detected if a 9 digit number, in the defined format, is a valid Portuguese Tax Identification Number.

@PT_NISS

(Portugal)

@PT_NISS (===========)

This operator detects valid Portuguese Social Security Numbers.

When the example text is used in an expression, a match is detected if an 11 digit number, in the defined format, is a valid Portuguese Social Security Number.

@RO_CNP

(Romania)

@RO_CNP (=============)

This operator detects valid Romanian Personal Numerical Codes.

When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Romanian Personal Numerical Code.

@SE_PIN

(Sweden)

@SE_PIN (======-==== OR ======== ==== OR ============)

This operator detects valid Swedish Personal ID Numbers.

When the example text is used in an expression, a match is detected if a 10 or 12 digit number, in one of the defined formats, is a valid Swedish Personal ID Number.

@SI_EMSO

(Slovenia)

@SI_EMSO (=============)

This operator detects valid Slovenian Master Citizen Numbers.

When the example text is used in an expression, a match is detected if a 13 digit number, in the defined format, is a valid Slovenian Master Citizen Number.

@SK_COP

(Slovakia)

@SK_COP (??====== OR ?? ======)

This operator detects valid Slovakian Citizen ID Numbers.

When the example text is used in an expression, a match is detected if an 8 character alphanumeric value, in the defined format, is a valid Slovakian Citizen ID Number.

@UK_NHS

@UK_NHS (=== === ====)

This operator detects valid UK National Health Service numbers.

When the example text is used in an expression, a match is detected if a 10 digit number, in the defined format, is a valid National Health Service number.

@UK_NINO

@UK_NINO (??======?)

This operator detects valid UK National Insurance Numbers.

When the example text is used in an expression, a match is detected if a 9 character alphanumeric value, in the defined format, is a valid UK National Insurance Number.
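The format strings shown with each operator combine the = and ? special characters with literal spaces and dashes. As a rough illustration of how to read them, the following Python sketch translates a format string into a regular expression and checks whether a candidate value has that shape. The function names `format_to_regex` and `has_format` are hypothetical, and the sketch checks the format only: it does not perform the validity checks that the operators apply, and it is not a description of how EDD evaluates these operators internally.

```python
import re

# Assumption: character classes follow the "Special characters" table:
# "=" is any single digit, "?" is any single alphanumeric, dash, or underscore.
PLACEHOLDERS = {"=": "[0-9]", "?": "[A-Za-z0-9_-]"}


def format_to_regex(fmt):
    """Translate a format such as "=== === ====" into a compiled regex.

    Alternative formats separated by " OR " become regex alternatives.
    Any character that is not a placeholder is matched literally.
    """
    alternatives = []
    for alt in fmt.split(" OR "):
        parts = [PLACEHOLDERS.get(ch, re.escape(ch)) for ch in alt.strip()]
        alternatives.append("".join(parts))
    return re.compile("(?:" + "|".join(alternatives) + ")")


def has_format(fmt, candidate):
    """Return True if the whole candidate fits one of the listed formats."""
    return format_to_regex(fmt).fullmatch(candidate) is not None


if __name__ == "__main__":
    print(has_format("=== === ====", "123 456 7890"))              # shape of @UK_NHS
    print(has_format("========= OR ==== == ===", "1234 56 789"))   # shape of @NL_BSN
    print(has_format("========= OR ==== == ===", "1234-56-789"))   # dashes are not in the format
```

A value that passes this shape check can still be rejected by the operator, because each operator also requires the value to be a valid identifier of its type.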

General requirements and limitations

The following guidelines apply to expressions and expression sets:

Expressions, including regular expressions, are not case sensitive.

For example, the expression "confidential w/1 Data" yields the same results as the expression "CONFIDENTIAL W/1 data".
Expressions can contain any number of nodes. Also see Node limitations.
Expressions can include the following special characters:
- Dash (-)
- Underscore (_)
- Supported special characters
  
  You will not receive a syntax error if you use an unsupported special character, but file content is not matched as expected. Therefore, to successfully match file content that contains an unsupported special character (for example, "A&W"), replace the special character with a space (for example, "A W"). When you view Matched Tokens after an EDD scan is run, a match to "A&W" shows as "A,W".
An expression set can contain an unlimited number of expressions, but it must contain at least one expression. If you delete all expressions from an expression set, delete the expression set.

However, if your rule does not adhere to all of the following limitations, you must correct each error before you can publish the rule:
The same string element (word) can't be used more than 64 times in an expression set.
A word is limited to 255 bytes in length (some UTF-8 characters are 4 bytes in length).
Node limitations:
- A node is defined as a text string that is separated from the rest of the expression by an operator. If the expression contains a phrase with no operators, then the phrase is a node.
- A node is limited to 512 bytes in length.
- A node can't include more than eight (8) words. This includes words that use any of the following special characters: ?, =, *
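The limits above can be checked before you try to publish a rule. The following Python sketch is one way to do that, under several assumptions that are not part of the product: the expression set has already been split into nodes (the text between operators), words are separated by whitespace, and repeated words are counted without regard to case. The function name `check_nodes` and the constant names are hypothetical.

```python
from collections import Counter
from typing import List

MAX_WORD_BYTES = 255    # a word is limited to 255 bytes
MAX_NODE_BYTES = 512    # a node is limited to 512 bytes
MAX_NODE_WORDS = 8      # a node can't include more than eight words
MAX_WORD_REPEATS = 64   # the same word at most 64 times per expression set


def check_nodes(nodes: List[str]) -> List[str]:
    """Return a description of every limit that the given nodes exceed."""
    problems = []
    word_counts = Counter()
    for node in nodes:
        words = node.split()  # assumption: words are whitespace separated
        if len(node.encode("utf-8")) > MAX_NODE_BYTES:
            problems.append(f"node longer than {MAX_NODE_BYTES} bytes: {node[:20]!r}...")
        if len(words) > MAX_NODE_WORDS:
            problems.append(f"node has {len(words)} words: {node[:20]!r}...")
        for word in words:
            # Byte length, not character count: some UTF-8 characters use 4 bytes.
            if len(word.encode("utf-8")) > MAX_WORD_BYTES:
                problems.append(f"word longer than {MAX_WORD_BYTES} bytes: {word[:20]!r}...")
        # Assumption: counting ignores case because expressions are not case sensitive.
        word_counts.update(word.lower() for word in words)
    for word, count in word_counts.items():
        if count > MAX_WORD_REPEATS:
            problems.append(f"word {word!r} used {count} times in the expression set")
    return problems


if __name__ == "__main__":
    for problem in check_nodes(["confidential", "a b c d e f g h i"]):
        print(problem)
```

Measuring encoded bytes rather than characters matters for non-ASCII text: a word of 64 four-byte characters is already 256 bytes long and exceeds the word limit.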

Regular expressions

A regular expression, also known as regex, is a sequence of characters that defines a search pattern for matching content. In EDD, you can use the regular expression operator (//) in an expression to detect file content that might be difficult to match using the other expression operators. In addition, you can use a web search to find a regular expression that detects a particular piece of information your organization is interested in. Regular expressions can be very simple or very complex.

Regular expressions used as operators:

- are not case sensitive.
- do not support flags after the second "/".

Due to the way EDD calculates match scores, the number of matches for a string may differ from the number of matches reported by online regex tools.

Make sure that you test your regular expressions carefully. The regular expression that you use may match the data you are interested in, but if you don't test it carefully, the EDD scan may detect an excessive number of matches and stop.

Example
You want to detect US phone numbers in the format nnn-nnn-nnnn. You use the following regular expression to detect 3 digits, a dash, 3 digits, a dash, and then 4 digits:

`/\d{3}-\d{3}-\d{4}/`

You test the expression and find that it successfully matches the number 555-555-1234, but it also matches 6555-555-1234. You add (^|\s) to the start of the expression so that the match has to be at the start of a new line or after whitespace:

`/(^|\s)\d{3}-\d{3}-\d{4}/`

You test the expression and find that it no longer matches 6555-555-1234, but it does match 555-555-123456789. You add (\s|$) to the end of the expression so that the match has to be followed by whitespace or be at the end of a line:

`/(^|\s)\d{3}-\d{3}-\d{4}(\s|$)/`

You test the expression and find that it no longer matches 555-555-123456789, but it also doesn't match 555-555-1234: T Jordan (Home). You change the end of the expression from (\s|$) to (\s|[,.:;]|$) so that the match has to be followed by whitespace, a comma, a period, a colon, a semicolon, or be at the end of a line:

`/(^|\s)\d{3}-\d{3}-\d{4}(\s|[,.:;]|$)/`

You test the expression and find that it successfully matches 555-555-1234: T Jordan (Home).
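The refinement steps in this example can be reproduced outside EDD before you add the expression to a rule. The following Python sketch runs each version of the pattern against the sample strings from the example. It uses Python's `re` module rather than the EDD engine, so it only approximates EDD behavior: the `IGNORECASE` and `MULTILINE` flags are assumptions chosen to mirror case-insensitive matching and line-based anchors, the patterns are written with a plain `|` for alternation and without the surrounding slashes, and the names `PATTERNS`, `SAMPLES`, and `match_table` are hypothetical.

```python
import re

# The four versions of the pattern, in the order they were refined.
PATTERNS = [
    r"\d{3}-\d{3}-\d{4}",
    r"(^|\s)\d{3}-\d{3}-\d{4}",
    r"(^|\s)\d{3}-\d{3}-\d{4}(\s|$)",
    r"(^|\s)\d{3}-\d{3}-\d{4}(\s|[,.:;]|$)",
]

# The sample strings used in the example.
SAMPLES = [
    "555-555-1234",
    "6555-555-1234",
    "555-555-123456789",
    "555-555-1234: T Jordan (Home)",
]


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    # Assumption: these flags approximate how EDD treats case and line anchors.
    return re.search(pattern, text, re.IGNORECASE | re.MULTILINE) is not None


def match_table():
    """Map each pattern to the list of samples it matches."""
    return {p: [s for s in SAMPLES if matches(p, s)] for p in PATTERNS}


if __name__ == "__main__":
    for pattern, hits in match_table().items():
        print(pattern)
        for sample in SAMPLES:
            print("   ", "match   " if sample in hits else "no match", sample)
```

Keeping the unwanted strings in the sample list alongside the wanted ones makes it easy to see whether a later change reintroduces an earlier false match.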

For more help creating expressions using the syntax supported in Endpoint Data Discovery Rules, contact Absolute Technical Support.