do not merge - this is a concept - feat: acquisition checkpoint #216
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,10 +11,12 @@ | |
*/ | ||
const { sampleRUM } = window.hlx.rum; | ||
|
||
const basicHash = (string, modulo) => Array.from(string) | ||
.map((a) => a.charCodeAt(0)) | ||
.reduce((a, b) => a + b, 1) % modulo; | ||
|
||
const fflags = { | ||
has: (flag) => fflags[flag].indexOf(Array.from(window.origin) | ||
.map((a) => a.charCodeAt(0)) | ||
.reduce((a, b) => a + b, 1) % 1371) !== -1, | ||
has: (flag) => fflags[flag].indexOf(basicHash(window.origin, 1371)) !== -1, | ||
enabled: (flag, callback) => fflags.has(flag) && callback(), | ||
disabled: (flag, callback) => !fflags.has(flag) && callback(), | ||
onetrust: [543, 770, 1136], | ||
|
@@ -269,3 +271,48 @@ fflags.enabled('email', () => { | |
params.filter((param) => regex.test(param)).forEach((param) => sampleRUM('email', { source: network, target: param })); | ||
}); | ||
}); | ||
|
||
// acquisition checkpoint | ||
(() => { | ||
const sanitize = (str) => (str || '').toLowerCase().replace(/[^a-zA-Z0-9]/g, ''); | ||
const toBinary = (s) => Array.from(s, (c) => parseInt(c, 16).toString(2).padStart(4, '0')).join(''); | ||
const moduli = [239, 241, 251]; // prime numbers smaller than 256 | ||
const knownVendors = toBinary('fbdef75ff9f4dedbfdeaba8f21e7884aebf67cfde6eefeea3b8ff32c6fb68a40'); // known vendors bloom filter | ||
const categories = { | ||
affiliate: ['aff', 'affiliate', 'affiliatemarketing'], | ||
Since you check for string inclusion… the 1st actually covers the other 2 |
||
audio: ['spotify'], | ||
The things that people click on in Spotify are display ads. You can't click an audio clip. |
||
brand: ['brand'], | ||
Brand advertising is not an ad format, it is a type of ad. No point listing it here. |
||
display: ['advertorial', 'banner', 'cpa', 'cpc', 'cpm', 'cpv', 'discover', 'display', 'fbads', 'goppc', 'highimpact', 'inred', 'nps', 'paid', 'paiddisplay', 'placement', 'post', 'poster', 'pp', 'ppc'], | ||
All the cp* and pp* could be combined as a simple regex, and paid already covers paiddisplay (also covered by display), post covers poster, etc. |
||
email: ['em', 'email', 'mail', 'newsletter'], | ||
email is already covered by em and mail |
||
local: ['yext'], | ||
owned: ['owned'], | ||
Again, not a category. |
||
qr: ['qr', 'qrcode'], | ||
qrcode is covered by qr |
||
search: ['direct', 'google', 'googleflights', 'paidsearch', 'paidsearchnb', 'sea', 'sem'], | ||
Cue @ramboz comment about |
||
sms: ['sms'], | ||
social: ['facebook', 'gnews', 'instagramfeed', 'instagramreels', 'instagramstories', 'line', 'linkedin', 'metasearch', 'organicsocialown', 'paidsocial', 'social', 'sociallinkedin', 'socialpaid'], | ||
video: ['native', 'paidvideo', 'pvid', 'video', 'youtube'], | ||
web: ['webapp'], | ||
}; | ||
const sources = { | ||
paid: ['affiliate', 'audio', 'display', 'local', 'search', 'social', 'video'], | ||
Shopping is a source I see often that I miss here.
I don't see enough values to justify this. The most common I see is
@trieloff Shopping is not an actual value you necessarily find in the params; it's a category you can derive from the hostname/vendor, much like social or search. Shopping is usually used for e-commerce sites. So think big websites like Amazon/Etsy/Alibaba/Rakuten/…, or anything based on Shopify/Magento/Squarespace/… So I'd kinda expect in shopping: ['alibaba', 'amazon', 'bestbuy', 'ebay', 'flipkart', 'otto', 'rakuten', 'target', 'walmart', …], We can probably also throw in a few European ones in there, like |
||
owned: ['brand', 'email', 'owned', 'qr', 'sms', 'web'], | ||
Push notifications could be added here as well. |
||
}; | ||
// these 'vendors' appear differently in the utmsource field. They are mapped to a single value: | ||
const vendorMappings = [ | ||
{ regex: /newsshowcase|aci|google|googleads|gads|google-ads|google_search|google_deman|aw|adwords|dv360|gdn|doubleclick|dbm|gmb/i, result: 'google' }, | ||
{ regex: /instagram|ig/i, result: 'instagram' }, | ||
{ regex: /face|fb|meta/i, result: 'facebook' }, | ||
{ regex: /email/i, result: 'email' }, | ||
{ regex: /bing/i, result: 'bing' }, | ||
{ regex: /amazon|ctv/i, result: 'amazon' }, | ||
{ regex: /qr/i, result: 'qrcode' }, | ||
{ regex: /youtube|yt/i, result: 'youtube' }, | ||
]; | ||
const utmMedium = sanitize(new URLSearchParams(window.location.search).get('utm_medium')); | ||
const utmSource = sanitize(new URLSearchParams(window.location.search).get('utm_source')); | ||
Comment on lines +311 to +312
This assumes that medium and source are used as different values that carry actual information. I don't think this is the case, and I'd just treat them as one. |
||
const preVendor = vendorMappings.find(({ regex }) => regex.test(utmSource))?.result || utmSource; | ||
The optional chaining limits browser compatibility. I don't think we use it anywhere else in the code base. |
||
const category = Object.keys(categories).find((key) => (categories[key] || []).includes(utmMedium)) || ''; | ||
const source = Object.keys(sources).find((key) => (sources[key] || []).includes(category)) || ''; | ||
const vendor = moduli.every((modulo) => knownVendors.charAt(basicHash(preVendor, modulo)) === '1') ? preVendor : ''; | ||
sampleRUM('acquisition', { source: `${source}:${category}:${vendor}` }); | ||
})(); |
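For discussion, the lookup flow in this hunk can be sketched as a pure function that takes a query string and returns the `source:category:vendor` value. This is only a sketch under stated assumptions: `classifyAcquisition` is a hypothetical name, the tables are trimmed to a few entries, and a plain `Set` stands in for the Bloom filter. It also resolves the vendor mapping with an explicit check rather than optional chaining, which is one possible way to address the compatibility concern raised in review.

```javascript
// Hypothetical, trimmed-down tables; the lists in the PR are longer.
const categories = {
  email: ['em', 'email', 'mail', 'newsletter'],
  search: ['paidsearch', 'sea', 'sem'],
};
const sources = {
  paid: ['search'],
  owned: ['email'],
};
const vendorMappings = [
  { regex: /google|adwords/i, result: 'google' },
  { regex: /email/i, result: 'email' },
];
// Assumption: a plain Set replaces the PR's Bloom filter to keep the sketch short.
const knownVendors = new Set(['google', 'email', 'bing']);

// Lowercase the value and strip every non-alphanumeric character (global flag).
const sanitize = (str) => (str || '').toLowerCase().replace(/[^a-z0-9]/g, '');

function classifyAcquisition(search) {
  const params = new URLSearchParams(search);
  const utmMedium = sanitize(params.get('utm_medium'));
  const utmSource = sanitize(params.get('utm_source'));
  // Explicit check instead of optional chaining.
  const mapping = vendorMappings.find(({ regex }) => regex.test(utmSource));
  const preVendor = mapping ? mapping.result : utmSource;
  // Array.prototype.includes is an exact match on list entries, not a substring test.
  const category = Object.keys(categories)
    .find((key) => categories[key].includes(utmMedium)) || '';
  const source = Object.keys(sources)
    .find((key) => sources[key].includes(category)) || '';
  const vendor = knownVendors.has(preVendor) ? preVendor : '';
  return `${source}:${category}:${vendor}`;
}

console.log(classifyAcquisition('?utm_medium=sem&utm_source=google-ads')); // paid:search:google
```

Keeping the classification free of `window` and `sampleRUM` would make it straightforward to unit test; the checkpoint call itself would then be a one-liner around it.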
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,138 @@ | ||
/* | ||
* Copyright 2024 Adobe. All rights reserved. | ||
* This file is licensed to you under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. You may obtain a copy | ||
* of the License at http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software distributed under | ||
* the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR REPRESENTATIONS | ||
* OF ANY KIND, either express or implied. See the License for the specific language | ||
* governing permissions and limitations under the License. | ||
*/ | ||
|
||
const moduli = [239, 241, 251]; | ||
|
||
const basicHash = (string, modulo) => Array.from(string) | ||
.map((a) => a.charCodeAt(0)) | ||
.reduce((a, b) => a + b, 1) % modulo; | ||
|
||
const binaryToText = (binaryString) => { | ||
return parseInt(binaryString, 2).toString(16); | ||
}; | ||
|
||
// known vendors | ||
const vendors = [ | ||
'adlocus', | ||
'admitadmonetize', | ||
'aftership', | ||
'amazon', | ||
'attentive', | ||
'avivid', | ||
'baidu', | ||
'banner', | ||
'bing', | ||
'blis', | ||
'cheetah', | ||
'cj', | ||
'clarin', | ||
'clm', | ||
'criteo', | ||
'demandgen', | ||
'digidip', | ||
'digitalremedycom', | ||
'discovery', | ||
'display', | ||
'eloqua', | ||
'email', | ||
'eminent', | ||
'facebook', | ||
'famoussmokeshopinc', | ||
'fark', | ||
'fashionistatop', | ||
'feedotter', | ||
'flipboard', | ||
'flyer', | ||
'geniusmonkey', | ||
'giftcardmall', | ||
'google', | ||
'hotstar', | ||
'hrs', | ||
'hsemail', | ||
'inmobicom', | ||
'inred', | ||
'insider', | ||
'instagram', | ||
'integrateddisplay', | ||
'internal', | ||
'line', | ||
'linkbux', | ||
'linkedin', | ||
'linkinbio', | ||
'locationpage', | ||
'lveng', | ||
'm2trans', | ||
'manutd', | ||
'marketo', | ||
'massiva', | ||
'mavenintent', | ||
'mediamond', | ||
'mentionme', | ||
'microsoft', | ||
'native', | ||
'newsletter', | ||
'nexus', | ||
'openweb', | ||
'optum', | ||
'outbrain', | ||
'outlook', | ||
'partner', | ||
'partnerstudentbeanscom', | ||
'petcademy', | ||
'pinterest', | ||
'pmax', | ||
'programmatic', | ||
'programmaticgdn', | ||
'pushly', | ||
'qrcode', | ||
'reddit', | ||
'redone', | ||
'retailercode', | ||
'seznam', | ||
'shopfully', | ||
'silverpop', | ||
'sky', | ||
'skyscanner', | ||
'snapchat', | ||
'spotify', | ||
'substack', | ||
'taboola', | ||
'teads', | ||
'thetradedesk', | ||
'tiktok', | ||
'tradedesk', | ||
'tradetracker', | ||
'twitter', | ||
'web', | ||
'yahoo', | ||
'yandex', | ||
'yext', | ||
'yieldkit', | ||
'youtube', | ||
]; | ||
|
||
// Initialize 256 chars long array filled with initial zeros | ||
const bloomFilter = new Array(256).fill(0); | ||
|
||
// Insert each vendor into the Bloom filter | ||
vendors.forEach((vendor) => { | ||
moduli.forEach((modulo) => { | ||
const hash = basicHash(vendor, modulo); | ||
bloomFilter[hash] = 1; | ||
}); | ||
}); | ||
|
||
const f = bloomFilter.reduce((acc, _, index) => (index % 4 === 0 ? [...acc, bloomFilter.slice(index, index + 4).join('')] : acc), []) | ||
.map((binaryChar) => binaryToText(binaryChar)) | ||
.join(''); | ||
|
||
console.log(`Bloom Filter: ${f}`); |
You could probably simplify even further with a regex approach.
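To make the relationship between the generator script and the runtime lookup easier to review, here is a minimal round-trip sketch. It reuses `basicHash`, the three moduli, and the hex packing from the diff; `buildFilter` and `mightContain` are hypothetical helper names introduced only for illustration, and the three-vendor list is a stand-in for the real one.

```javascript
const moduli = [239, 241, 251];

// Same additive hash as in the diff: sum of char codes (seeded with 1), modulo a prime.
const basicHash = (string, modulo) => Array.from(string)
  .map((a) => a.charCodeAt(0))
  .reduce((a, b) => a + b, 1) % modulo;

// Build side: set one bit per modulus for each vendor, then pack 4 bits per hex digit.
const buildFilter = (vendors) => {
  const bits = new Array(256).fill(0);
  vendors.forEach((vendor) => {
    moduli.forEach((modulo) => { bits[basicHash(vendor, modulo)] = 1; });
  });
  let hex = '';
  for (let i = 0; i < bits.length; i += 4) {
    hex += parseInt(bits.slice(i, i + 4).join(''), 2).toString(16);
  }
  return hex;
};

// Lookup side: expand each hex digit back to 4 bits and test all three positions.
const toBinary = (s) => Array.from(s, (c) => parseInt(c, 16).toString(2).padStart(4, '0')).join('');
const mightContain = (hex, vendor) => {
  const bits = toBinary(hex);
  return moduli.every((modulo) => bits.charAt(basicHash(vendor, modulo)) === '1');
};

const filter = buildFilter(['google', 'bing', 'yext']);
console.log(filter.length); // 64
console.log(mightContain(filter, 'google')); // true
console.log(mightContain(filter, 'gnib')); // true: same letters as 'bing', so same hash
```

One property worth keeping in mind when reviewing: because `basicHash` only sums character codes, any two strings made of the same characters hash identically, so anagrams of a known vendor always pass the check. Other false positives are possible as well, as with any Bloom filter, while inserted vendors are never rejected.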