# Better

Better protects you from unethical web sites. It makes your web experience safer, lighter, and faster.

Better enforces the [Ethical Design Manifesto](https://ind.ie/ethical-design). It helps the Web respect human rights, effort, and experience.

Better is curated by Ind.ie, a tiny two-person-and-one-husky social enterprise striving for social justice in the digital age. Better is free, open, and transparent.

## Content

This repository contains the Better content: Better’s database of information on trackers and other malware as well as the web sites that host them.

This content is in Blockdown format. Blockdown is an extension of Markdown with special vocabulary to describe web malware. Blockdown can also contain WebKit content blocking rules. The Blockdown pages in Better’s content repository both describe web malware and contain the rules to block them.

This content is processed by [Better Builder](https://source.ind.ie/better/builder) to generate the [Better web site](https://better.fyi) as well as the data for the [Better iOS App](https://source.ind.ie/better/app), including a WebKit `blockerList.json` file.

A key advantage of Better is that its database is human-readable, open, and extensible via pull requests. (The database is curated by Ind.ie using the Ethical Design Manifesto as the criterion for blocking rules.)

Contributing to the content is as easy as creating an account on [source.ind.ie](https://source.ind.ie) and editing a content page in your browser.

## I’m not a developer, I just want to experience a Better web.

[Get Better from the App Store.](https://itunes.apple.com/us/app/better-by-ind.ie/id1080964978?mt=8)

## How can I support Better?

Buying [Better on the App Store](https://itunes.apple.com/us/app/better-by-ind.ie/id1080964978?mt=8) is one way to support us. If you want to help with the ongoing costs of developing and maintaining Better, you can [donate to Ind.ie](https://ind.ie/fund/) or, even better, [become a patron](https://ind.ie/fund/) by setting up a recurring donation.

## I’m a developer, let me in!

The easiest way to get started is to follow the instructions in the readme for the [Better iOS app](https://source.ind.ie/better/app) repository.

## Testing locally.

[Better Builder](https://source.ind.ie/better/builder) will automatically pick up your changes as you save and rebuild your local data.

To persist your changes locally, commit them in Git and push to origin:

```bash
git commit -am "My awesome content update"
git push origin master
```

Note that these changes will be destroyed if you run the Better Builder installer (or the Better iOS installer, which runs the Better Builder installer as part of its own installation process). To avoid losing work, save your changes regularly by pushing to production, as explained below.

## Saving your changes by pushing to production

You can push to production with:

```bash
./save
```

Or, manually run what the save script does, which is:

```bash
git push live master
```

## Deployment

Before you can deploy, you must [set up a GPG key and configure Git to use it](https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work). This is used to sign your tags.

Then, if you have commit rights to the content repository, just run the deployment script:

```bash
./deploy
```

This will create a tag (you will have to enter a tag message when prompted, describing the release) and push it to production. Please make sure that you have already committed your changes and pushed them to production, either via `git push live master` or by running the `./save` script, which does the same thing.

# Guide to Blockdown

Better content is authored in Blockdown.

Blockdown is Markdown with an extended high-level vocabulary for describing web malware for the Better knowledge base.

## Sites

Site pages have the following sections:

### Ethical design violations

```markdown
## Ethical design violations
```

This is a list of ethical design violations that gets converted to a collection of badges on the rendered site pages. The Trackers part of the list, detailed below, is updated automatically by [Better Inspector](https://source.ind.ie/better/inspector).

#### Trackers

The first badge is always the trackers badge. In Blockdown, it is represented by a `(Trackers)` list item followed by a nested list of the trackers:

```markdown
  * (Trackers)
    * Automatically
    * Generated
    * List
    * of
    * Trackers
```

This gets automatically translated by [Better Builder](https://source.ind.ie/better/builder) to a badge similar to the one below:

![Screenshot of the trackers badge](images/readme/better/trackers-badge-example.png)

Tapping on the badge displays a popover with links to the actual trackers.

![Screenshot of the trackers popover](images/readme/better/trackers-popover-example.png)

The other badges are manually added if they apply to the site in question:

#### Aggressive

```markdown
* (Aggressive)
```

Attempts to block content blockers.

![Screenshot of the Aggressive Badge](images/readme/better/aggressive-badge-example.png)


#### Doorslam

```markdown
* (Doorslam)
```

Interrupts and blocks using modal dialogs.

![Screenshot of the Doorslam Badge](images/readme/better/doorslam-badge-example.png)


#### Clickbait

```markdown
* (Clickbait)
```

Uses exploitative, addictive content syndication network(s).

![Screenshot of the Clickbait Badge](images/readme/better/clickbait-badge-example.png)


#### Fingerprint

```markdown
* (Fingerprint)
```

Uses hidden Canvas fingerprinting.

![Screenshot of the Fingerprint Badge](images/readme/better/fingerprint-badge-example.png)


#### Web Bug

```markdown
* (Web bug)
```

Uses invisible tracking pixels.

![Screenshot of the Web Bugs Badge](images/readme/better/web-bugs-badge-example.png)

We might create new badges as and when we find new types of web malware and unethical practices to document and warn people about.
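
To illustrate how the list format above maps to badges, here is a minimal Python sketch. It is not how Better Builder actually parses Blockdown: the function name `parse_violations`, the regular expressions, and the example tracker domains are assumptions made for this illustration. It treats list items in parentheses as badges and indented items under `(Trackers)` as tracker entries.

```python
import re

# Hypothetical helper for illustration; not part of Better Builder.
BADGE = re.compile(r"^\s*\*\s+\((?P<name>[^)]+)\)\s*$")
ITEM = re.compile(r"^\s+\*\s+(?P<text>[^(].*?)\s*$")


def parse_violations(section: str):
    """Return (badges, trackers) from an 'Ethical design violations' list."""
    badges, trackers = [], []
    in_trackers = False
    for line in section.splitlines():
        badge = BADGE.match(line)
        if badge:
            name = badge.group("name")
            badges.append(name)
            # Only items nested under (Trackers) count as tracker entries.
            in_trackers = name == "Trackers"
            continue
        item = ITEM.match(line)
        if item and in_trackers:
            trackers.append(item.group("text"))
    return badges, trackers


# Placeholder content: the tracker domains are made up for this example.
example = """\
  * (Trackers)
    * tracker-one.example
    * tracker-two.example
  * (Doorslam)
  * (Web bug)
"""

badges, trackers = parse_violations(example)
print(badges)
print(trackers)
```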

## After Better section

```markdown
## After Better
```

The After Better section provides statistics about the before (without the Better content blocker active) and after (with the Better content blocker active) performance of a site.

It is automatically generated by [Better Inspector](https://source.ind.ie/better/inspector).

![Screenshot of the After Better Section](images/readme/better/after-better.png)

## Block Rules section

This is the section where we enter the actual WebKit content blocking rules. Each rule is written in a strict subset of MSON (Markdown JSON) and has a brief explanation detailing what the rule does and why.

The blocking rules in this section serve the following purposes, in line with the [Ethical Design Manifesto](https://ind.ie/ethical-design):

  * Remove any first-party trackers (respect human rights)
  * Improve the usability of the site by removing first-party impediments like doorslams (respect human effort)
  * Improve the experience of the site (respect human experience). We should especially aim to create a better experience after trackers have been removed, for example by removing the empty spaces they leave behind.

Please note that this is not the place to put blocking rules for trackers. Each tracker encountered should be entered into the [Trackers](#trackers) section and have its own page in the `/trackers` section of the content.

### Blockdown syntax

Here is an example of a site-specific blocking rule in Blockdown format:

```markdown
```mson
- trigger:
  - url-filter: cdn.cultofmac.com/wp-content/plugins/com2014-ads/static/js/frontend-functionality.js
- action:
  - type: block
``` 
```

The Blockdown parser in Better supports all of the [WebKit content blocking rules](https://webkit.org/blog/3476/content-blockers-first-look/). Instead of JSON, however, we enter blocking rules in MSON. All Blockdown rules are combined by Better Builder into a single `blockerList.json` file.

Blockdown differs from plain WebKit content blocker rules in several ways to make authoring easier and to aid in readability:

1. The default load type is ‘third-party’.
2. The default action type is ‘block’.
3. The default is for rules to be case sensitive.

So, if we take the following fully-specified rule:

```markdown
```mson
- trigger:
  - url-filter: somedomain.ext
  - load-type: third-party
  - url-filter-is-case-sensitive: true
- action:
  - type: block
``` 
```

We can simplify it naïvely by removing the properties that have defaults:

```markdown
```mson
- trigger:
  - url-filter: somedomain.ext
- action
``` 
```

Which leaves us with a valid rule but a sad-looking empty action section. In Blockdown neither the trigger nor action sections are required, so we can remove those also. This leaves us with:

```markdown
```mson
  - url-filter: somedomain.ext
``` 
```

But surely, we can do better than that. So we handle this special case in Blockdown by not requiring the url-filter key either:

```markdown
```mson
somedomain.ext
``` 
```

Ah, better! ;)

All of the above Blockdown rules are equivalent and will compile into the following fully-formed and highly specific WebKit content blocking rule in JSON:

```json
{
  "trigger": {
    "load-type": [
      "third-party"
    ],
    "url-filter-is-case-sensitive": true,
    "url-filter": "^[^:]+://+([^:/]+\\.)?somedomain\\.ext[:/]?"
  },
  "action": {
    "type": "block"
  }
}
```
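
The defaulting behaviour described above can be sketched in Python. This is an illustration only, not Better Builder's code: `expand_rule` is a hypothetical name, and the sketch assumes the MSON has already been parsed into plain Python values (a bare string, a flat dict of trigger properties, or a dict with `trigger` and `action` keys). It leaves the `url-filter` value untouched.

```python
import json


def expand_rule(rule):
    """Expand a shorthand rule into a fully specified rule dict.

    Hypothetical sketch: the accepted input shapes are assumptions,
    the defaults are the ones listed in this guide.
    """
    if isinstance(rule, str):
        # Bare string shorthand: the string is the url-filter.
        rule = {"url-filter": rule}
    if "trigger" not in rule and "action" not in rule:
        # Flat shorthand: every key is a trigger property.
        rule = {"trigger": dict(rule)}

    # Start from the documented defaults, then apply explicit values.
    trigger = {
        "load-type": ["third-party"],
        "url-filter-is-case-sensitive": True,
    }
    trigger.update(rule.get("trigger") or {})
    action = {"type": "block"}
    action.update(rule.get("action") or {})
    return {"trigger": trigger, "action": action}


# The four equivalent forms from this section, as already-parsed values.
forms = [
    "somedomain.ext",
    {"url-filter": "somedomain.ext"},
    {"trigger": {"url-filter": "somedomain.ext"}, "action": {}},
    {
        "trigger": {
            "url-filter": "somedomain.ext",
            "load-type": ["third-party"],
            "url-filter-is-case-sensitive": True,
        },
        "action": {"type": "block"},
    },
]

expanded = [expand_rule(form) for form in forms]
print(json.dumps(expanded[0], indent=2))
```

All four inputs expand to the same dictionary, which mirrors the claim that the shorthand forms are equivalent.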

## Automatic URL filter compilation

Blockdown automatically compiles simple `url-filter` properties to regular expressions with higher specificity as recommended in the [domain targeting recommendations by WebKit engineer Benjamin Poulain](https://webkit.org/blog/4062/targeting-domains-with-content-blockers/).

This means that you can author your entries in plain text, like this:

```markdown

  - url-filter: some-domain.ext

```

And Blockdown will compile them into the following form in `blockerList.json`:

```json

  "url-filter": "^[^:]+://+([^:/]+\\.)?some-domain\\.ext[:/]?"

```
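
As a rough illustration of this compilation step, the Python sketch below wraps a plain domain in the prefix and suffix seen in the compiled example. The function name `compile_url_filter` is hypothetical, and the sketch assumes that only dots need escaping; Better Builder's real implementation may differ.

```python
import json
import re

# Prefix and suffix taken from the compiled example in this guide.
PREFIX = r"^[^:]+://+([^:/]+\.)?"
SUFFIX = r"[:/]?"


def compile_url_filter(domain: str) -> str:
    """Turn a plain domain into the more specific url-filter pattern.

    Assumption: only dots need escaping, as in the example above. A real
    implementation would also have to handle other special characters and
    filters that are already regular expressions.
    """
    return PREFIX + domain.replace(".", r"\.") + SUFFIX


pattern = compile_url_filter("some-domain.ext")
# json.dumps doubles the backslashes, as in blockerList.json.
print(json.dumps({"url-filter": pattern}))

# Python's re module is used here only to illustrate what the pattern
# matches; WebKit uses its own regular expression engine.
matcher = re.compile(pattern)
for url in (
    "https://some-domain.ext/script.js",
    "https://cdn.some-domain.ext/script.js",
    "https://not-some-domain.ext/script.js",
):
    print(url, bool(matcher.search(url)))
```

The optional `([^:/]+\.)?` group is what lets the pattern match subdomains while rejecting a different domain that merely ends in the same text.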

## Further reading on WebKit content blocking

  * [Introduction to WebKit Content Blockers](https://webkit.org/blog/3476/content-blockers-first-look/)
  * [Targeting Domains with Content Blockers](https://webkit.org/blog/4062/targeting-domains-with-content-blockers/)
  * [Official Safari content-blocking rules documentation from Apple](https://developer.apple.com/library/mac/documentation/Extensions/Conceptual/ContentBlockingRules/Introduction/Introduction.html)

# Investigation process

Currently, you need to have commit rights to the Content repository to use the Better command-line commands. However, you can use Git directly to fork the repository and submit merge requests, and you can [add and edit pages through the online GitLab interface](https://source.ind.ie/better/content) without commit rights.

## Find who owns and runs the tracker

1. **Start by editing the tracker**

	```bash
	better/edit drafts/trackers/somedoma.in
	```

	This will create an issue in GitLab (or update an existing issue, if one already exists) and create or check out a branch for you. It will also open your working copy of the tracker page in your system editor and in the browser.

2. **Enter the tracker URL into your browser in a private window to see if it loads.**

	Make sure you don’t have any VPNs or extensions blocking requests or making your browser behave differently from the norm. If you have any tracker blockers already enabled, they may make it harder to investigate!

3. **If it doesn’t load, or if you get a blank page, perform a whois.**

	We are currently using http://whois.domaintools.com for these so we can link to it as a source when stating ownership information. However, you will sometimes get more information from a direct whois look-up on your machine. In Terminal: `whois somedoma.in`

4. **Some trackers use a domain proxy or a cloaking service** (e.g., Domains by Proxy) to further hide their origins. In this case:
    * Open up the source of a site that the tracker originated on in the Web Developer console (Timeline view) of Safari (or in the web inspector of your browser of choice)
    * Try to recreate the original call. This might give you more clues about its origin. 

To find which sites a tracker is on, perform a search on the `~/better.fyi/drafts/trackers` folder. For example, you can open the folder in your text editor and do a global search for the tracker name.
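
If you prefer to script that search, a minimal Python sketch could look like the one below. The helper name `find_mentions` is made up for this example, and the tracker name is a placeholder; the folder is the one named above. It simply lists every file under the folder whose text contains the search string.

```python
from pathlib import Path


def find_mentions(folder, needle: str):
    """Return paths of files under `folder` whose text contains `needle`."""
    root = Path(folder).expanduser()
    if not root.is_dir():
        return []
    matches = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # Skip files that cannot be read.
        if needle in text:
            matches.append(path)
    return matches


if __name__ == "__main__":
    # Placeholder tracker name; replace it with the one you are investigating.
    for path in find_mentions("~/better.fyi/drafts/trackers", "somedoma.in"):
        print(path)
```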

You can also use [Better Inspector](https://source.ind.ie/better/inspector) to search for strings within requests. For example, to find all URLs that contain *google.com*, run:

```bash
	./inquiry --local --find=google.com
```

Other useful tools:

* [Mozilla Lightbeam](https://www.mozilla.org/en-US/lightbeam/)

## Add the tracker/site name to the tracker markdown file

The name should be formatted as:

```markdown
**TrackerName** by Corporation (domain.tld)
```

If the tracker name is the same as the corporation name *(e.g. Adlucent by Adlucent)*, then just keep the tracker name and don’t include the corporation name.

*When you edit a tracker markdown file for the first time, the domain.tld will already be in the title.*
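
The naming convention above can be expressed as a small Python helper. This is a sketch under assumptions: `tracker_title` is a hypothetical name, the comparison ignores case and surrounding whitespace, and the shortened form is assumed to be `**TrackerName** (domain.tld)`.

```python
def tracker_title(name: str, corporation: str, domain: str) -> str:
    """Format a tracker page title following the convention above."""
    if name.strip().lower() == corporation.strip().lower():
        # Same name: keep only the tracker name (assumed shortened form).
        return f"**{name}** ({domain})"
    return f"**{name}** by {corporation} ({domain})"


print(tracker_title("TrackerName", "Corporation", "domain.tld"))
print(tracker_title("Adlucent", "Adlucent", "domain.tld"))
```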

## Add the site description

Add a concise one-line description of what the tracker, or the tracker owner, does.

*Usually the tracker sites have vague marketing spiel to describe themselves. Often a clearer description can be found in their privacy policy. If you can’t find a concise description in their own words, try to find their entry on [Wikipedia](https://wikipedia.org), Bloomberg or Crunchbase.*

Other useful tools:

* [Wikipedia](https://wikipedia.org)

## Include references in Notes

* Whether it’s the domain whois, or where you found the site description, include a link back to every source in the Notes section.
* Include a link to the tracker/corporation Privacy Policy (if it exists!)
* If you end up looking through the source file to find more information, you can include relevant code snippets in markdown.

*You can use sub-lists in Notes by using indented lists in markdown.*
*[See the Demandbase tracker for a varied use of Notes](https://better.fyi/trackers/company-target.com/#notes)*

# Handling duplicate trackers

Loads of trackers have multiple domains for the same tracker, or group of trackers. In this case, we don’t want duplicate entries that don’t stay in sync.

1. The first tracker found and investigated is the canonical tracker.

2. Any further trackers with the same name/owner should link to the canonical tracker in place of the description. *Example from [addthisedge.com tracker](https://better.fyi/trackers/addthisedge.com/):*

	```markdown
	> See [addthis.com](/trackers/addthis.com/).
	```

3. The Ethical Design Violations are still necessary, as the type of violation might vary between the domains.

4. The Block Rule is still necessary, as it blocks this specific domain.

5. The only Notes entry necessary is the source for the domain origin. Any other notes can be added to the canonical tracker.
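
For illustration, the line that replaces the description on a duplicate tracker page (step 2 above) could be generated with a helper like this. The function name `canonical_reference` is an assumption for this sketch; the output format follows the addthisedge.com example.

```python
def canonical_reference(canonical_domain: str) -> str:
    """Build the blockquote line that points to the canonical tracker page."""
    return f"> See [{canonical_domain}](/trackers/{canonical_domain}/)."


print(canonical_reference("addthis.com"))
```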

Read more in the [Better Content styleguide](styleguide.md).

[Discussions on the Investigations processes can be found on the Ind.ie forum](https://forum.ind.ie/c/better/investigations).