# Better

Better protects you from unethical web sites. It makes your web experience safer, lighter, and faster.

Better enforces the [Ethical Design Manifesto](https://ind.ie/ethical-design). It helps the Web respect human rights, effort, and experience.

Better is curated by Ind.ie, a social enterprise that defends human rights. It’s free, open, and transparent.

## Content

This repository contains the Better content: Better’s database of information on trackers and other malware as well as the web sites that host them.

This content is in Blockdown format. Blockdown is an extension of Markdown with special vocabulary to describe web malware. Blockdown can also contain WebKit content blocking rules. The Blockdown pages in Better’s content repository both describe web malware and contain the rules to block them.

This content is processed by [Better Builder](https://source.ind.ie/better/builder) to generate the [Better web site](https://better.fyi) as well as the data for the [Better iOS App](https://source.ind.ie/better/app), including a WebKit `blockerList.json` file.

A key advantage of Better is that its database is human-readable, open, and extensible via pull requests. (The database is curated by Ind.ie, using the Ethical Design Manifesto as the criterion for blocking rules.)

Contributing to the content is as easy as creating an account on [source.ind.ie](https://source.ind.ie) and editing a content page in your browser.

## I’m not a developer, I just want to experience a Better web.

[Get Better from the App Store.](https://itunes.apple.com/us/app/better-by-ind.ie/id1080964978?mt=8)

## How can I support Better?

Buying [Better on the App Store](https://itunes.apple.com/us/app/better-by-ind.ie/id1080964978?mt=8) is one way to support us. If you want to help with the ongoing costs of developing and maintaining Better, you can [donate to Ind.ie](https://ind.ie/fund/) or, even better, [become a patron](https://ind.ie/fund/) by setting up a recurring donation.

## I’m a developer, let me in!

The easiest way to get started is to follow the instructions in the readme for the [Better iOS app](https://source.ind.ie/better/app) repository.

## Testing locally.

[Better Builder](https://source.ind.ie/better/builder) will automatically pick up your changes as you save and rebuild your local data.

To persist your changes locally, commit them in Git and push to origin:

```bash
git commit -am "My awesome content update"
git push origin master
```

Note that these changes will be destroyed if you run the Better Builder installer (or the Better iOS installer, which runs the Better Builder installer as part of its own installation process). To avoid losing work, save your changes regularly by pushing to production, as explained below.

## Saving your changes by pushing to production

You can push to production with:

```bash
./save
```

Or, manually run what the save script does, which is:

```bash
git push live master
```

## Deployment

If you have commit rights to the content repository, just run the deployment script:

```bash
./deploy
```

This will create a tag (you will have to enter a tag message describing the release when prompted) and push it to production. Please make sure that you have already committed your changes and pushed them to production, either via `git push live master` or by running the `./save` script, which does the same thing.

# Guide to Blockdown

Better content is authored in Blockdown.

Blockdown is Markdown with an extended high-level vocabulary for describing web malware for the Better knowledge base.

## Sites

Site pages have the following sections:

### Ethical design violations

```markdown
## Ethical design violations
```

This is a list of ethical design violations that gets converted to a collection of badges on the rendered site pages. The Trackers part of the list, detailed below, is updated automatically by [Better Inspector](https://source.ind.ie/better/inspector).

#### Trackers

The first badge is always the trackers badge. In Blockdown it is represented by a list item introduced by the word `Trackers`:

```markdown
  * Trackers
    * Automatically
    * Generated
    * List
    * of
    * Trackers
```

This gets automatically translated by [Better Builder](https://source.ind.ie/better/builder) to a badge similar to the one below:

![Screenshot of the trackers badge](images/readme/better/trackers-badge-example.png)

Tapping on the badge displays a popover with links to the actual trackers.

![Screenshot of the trackers popover](images/readme/better/trackers-popover-example.png)

The other badges are manually added if they apply to the site in question:

#### Aggressive

```markdown
* Aggressive
```

Attempts to block content blockers.

![Screenshot of the Aggressive Badge](images/readme/better/aggressive-badge-example.png)


#### Doorslam

```markdown
* Doorslam
```

Interrupts and blocks using modal dialogs.

![Screenshot of the Doorslam Badge](images/readme/better/doorslam-badge-example.png)


#### Clickbait

```markdown
* Clickbait
```

Uses exploitative, addictive content syndication network(s).

![Screenshot of the Clickbait Badge](images/readme/better/clickbait-badge-example.png)


#### Fingerprint

```markdown
* Fingerprint
```

Uses hidden Canvas fingerprinting.

![Screenshot of the Fingerprint Badge](images/readme/better/fingerprint-badge-example.png)


#### Web Bug

```markdown
* Web bug
```

Uses invisible tracking pixels.

![Screenshot of the Web Bugs Badge](images/readme/better/web-bugs-badge-example.png)

We might create new badges as and when we find new types of web malware and unethical practices to document and warn people about.

## After Better section

```markdown
## After Better
```

The After Better section provides statistics about the before (without the Better content blocker active) and after (with the Better content blocker active) performance of a site.

It is automatically generated by [Better Inspector](https://source.ind.ie/better/inspector).

![Screenshot of the After Better Section](images/readme/better/after-better.png)

## Block Rules section

This is the section where we enter the actual WebKit content blocking rules. Each rule is written in a strict subset of MSON (Markdown JSON) and has a brief explanation detailing what the rule does and why.

The blocking rules in this section serve the following purposes, in line with the [Ethical Design Manifesto](https://ind.ie/ethical-design):

  * Remove any first-party trackers (respect human rights)
  * Improve the usability of the site by removing first-party impediments like doorslams (respect human effort)
  * Improve the experience of the site (respect human experience) – we should especially aim to create a better experience after trackers have been removed (for example, by removing the empty spaces left over)

Please note that this is not the place to put blocking rules for trackers. Each tracker encountered should be entered into the [Trackers](#trackers) section and have its own page in the `/trackers` section of the content.

### Blockdown syntax

Here is an example of a site-specific blocking rule in Blockdown format:

```markdown
```mson
- trigger:
  - url-filter: cdn.cultofmac.com/wp-content/plugins/com2014-ads/static/js/frontend-functionality.js
- action:
  - type: block
``` 
```

The Blockdown parser in Better supports all of the [WebKit content blocking rules](https://webkit.org/blog/3476/content-blockers-first-look/). Instead of JSON, however, we enter blocking rules in MSON. All Blockdown rules are combined by Better Builder into a single `blockerList.json` file.

Blockdown differs from plain WebKit content blocker rules in several ways to make authoring easier and to aid in readability:

1. The default load type is ‘third-party’.
2. The default action type is ‘block’.
3. The default is for rules to be case sensitive.

So, if we take the following fully-specified rule:

```markdown
```mson
- trigger:
  - url-filter: somedomain.ext
  - load-type: third-party
  - url-filter-is-case-sensitive: true
- action:
  - type: block
``` 
```

We can simplify it naïvely by removing the properties that have defaults:

```markdown
```mson
- trigger:
  - url-filter: somedomain.ext
- action
``` 
```

Which leaves us with a valid rule but a sad-looking empty action section. In Blockdown neither the trigger nor action sections are required, so we can remove those also. This leaves us with:

```markdown
```mson
  - url-filter: somedomain.ext
``` 
```

But surely, we can do better than that. So we handle this special case in Blockdown by not requiring the url-filter key either:

```markdown
```mson
somedomain.ext
``` 
```

Ah, better! ;)

All of the above Blockdown rules are equivalent and will compile into the following fully-formed and highly specific WebKit content blocking rule in JSON:

```json
{
  "trigger": {
   "load-type": [
      "third-party"
    ],
    "url-filter-is-case-sensitive": true,
    "url-filter": "^[^:]+://+([^:/]+\\.)?somedomain\\.ext[:/]?"
  },
  "action": {
    "type": "block"
  }
}
```
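
The way the three defaults fill in a shorthand rule can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not Better Builder's actual code: `expand_rule`, `DEFAULT_TRIGGER` and `DEFAULT_ACTION` are hypothetical names, and the sketch leaves the `url-filter` value exactly as authored instead of compiling it to a regular expression.

```python
import copy
import json

# Defaults taken from the list above. All names here are illustrative.
DEFAULT_TRIGGER = {
    "load-type": ["third-party"],
    "url-filter-is-case-sensitive": True,
}
DEFAULT_ACTION = {"type": "block"}


def expand_rule(rule):
    """Fill in the documented defaults for a shorthand rule.

    `rule` may be a bare string (treated as the url-filter), a flat dict of
    trigger properties, or a dict with "trigger" and/or "action" sections.
    The url-filter itself is left exactly as authored.
    """
    if isinstance(rule, str):
        rule = {"url-filter": rule}
    if "trigger" not in rule and "action" not in rule:
        rule = {"trigger": rule}
    trigger = copy.deepcopy(DEFAULT_TRIGGER)
    trigger.update(rule.get("trigger") or {})
    action = dict(DEFAULT_ACTION)
    action.update(rule.get("action") or {})
    return {"trigger": trigger, "action": action}


shorthand = "somedomain.ext"
print(json.dumps(expand_rule(shorthand), indent=2))
```

Under these assumptions, the bare string, the flat `url-filter` form, and the fully specified form all expand to the same dictionary, which mirrors the equivalence described above.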

## Automatic URL filter compilation

Blockdown automatically compiles simple `url-filter` properties to regular expressions with higher specificity as recommended in the [domain targeting recommendations by WebKit engineer Benjamin Poulain](https://webkit.org/blog/4062/targeting-domains-with-content-blockers/).

This means that you can author your entries in plain text, like this:

```markdown

  - url-filter: some-domain.ext

```

And Blockdown will compile them into the following form in `blockerList.json`:

```json

  "url-filter": "^[^:]+://+([^:/]+\\.)?some-domain\\.ext[:/]?"

```
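
As a rough illustration of that transformation, the sketch below rebuilds the pattern for a plain domain in Python. `compile_url_filter` is a hypothetical helper, not the Blockdown parser's real implementation; it assumes that the only characters needing escaping are the dots in the domain. Python's `re` module is used here purely to try the pattern out and is not the engine WebKit uses.

```python
import json
import re


def compile_url_filter(domain):
    """Build the higher-specificity pattern for a plain domain."""
    escaped = domain.replace(".", r"\.")  # escape literal dots only
    return r"^[^:]+://+([^:/]+\.)?" + escaped + r"[:/]?"


pattern = compile_url_filter("some-domain.ext")
print(json.dumps(pattern))  # JSON encoding doubles each backslash

# Python's `re` stands in for WebKit's matcher here, for illustration only.
print(bool(re.search(pattern, "https://cdn.some-domain.ext/script.js")))
print(bool(re.search(pattern, "https://example.com/?ref=some-domain.ext")))
```

The point of the anchored pattern is that the domain has to appear in the host position of the URL (optionally behind a subdomain), so a URL that merely mentions the domain in its query string is not matched.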

## Further reading on WebKit content blocking

  * [Introduction to WebKit Content Blockers](https://webkit.org/blog/3476/content-blockers-first-look/)
  * [Targeting Domains with Content Blockers](https://webkit.org/blog/4062/targeting-domains-with-content-blockers/)
  * [Official Safari content-blocking rules documentation from Apple](https://developer.apple.com/library/mac/documentation/Extensions/Conceptual/ContentBlockingRules/Introduction/Introduction.html)

# Investigation process

Currently, you need commit rights to the Content repository to use the Better command-line commands. However, you can use Git directly to fork the repository and submit merge requests, and you can [add and edit pages through the online GitLab interface](https://source.ind.ie/better/content) without commit rights.

## Find who owns and runs the tracker

1. **Start by editing the tracker**

	```bash
	better/edit drafts/trackers/somedoma.in
	```

	This will create an issue in GitLab (or update an existing issue, if one already exists) and create or check out a branch for you. It will also open your working copy of the tracker page in your system editor and in the browser.

2. **Enter the tracker URL into your browser in a private window to see if it loads.**

	Make sure you don’t have any VPNs or extensions that block requests or make your browser behave differently from the norm. If you have any tracker blockers already enabled, they may make it harder to investigate!

3. **If it doesn’t load, or if you get a blank page, perform a whois.**

	We currently use http://whois.domaintools.com for these so we can link to it as a source when stating ownership information. However, you will sometimes get more information from a direct whois look-up on your machine. In Terminal: `whois somedoma.in`

4. **Some trackers use a domain proxy or a cloaking service** (e.g., Domains by Proxy) to further hide their origins. In this case:

	* Open up the source of a site that the tracker originated on in the Web Developer console (Timeline view) of Safari (or in the web inspector of your browser of choice)

	* Try to recreate the original call. This might give you more clues about its origin. 

To find which sites a tracker is on, perform a search on the ~/better.fyi/drafts/trackers folder. For example, you can open up the folder in your text editor and do a global search for the tracker name.
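
If you prefer to script that search, the following is a minimal sketch in Python. `find_mentions` is a hypothetical helper and not part of the Better tooling; it assumes the pages are plain-text files and does a simple case-insensitive substring match.

```python
from pathlib import Path


def find_mentions(folder, needle):
    """Return sorted paths of files under `folder` whose text contains `needle`."""
    matches = []
    for path in sorted(Path(folder).expanduser().rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # skip unreadable files rather than abort the search
        if needle.lower() in text.lower():
            matches.append(path)
    return matches
```

For example, `find_mentions("~/better.fyi/drafts/trackers", "somedoma.in")` would list every page in that folder that mentions the tracker domain.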

You can also use [Better Inspector](https://source.ind.ie/better/inspector) to search for strings within requests. For example, to find all URLs that contain *google.com*, run:

```bash
	./inquiry --local --find=google.com
```

Other useful tools:

* [Mozilla Lightbeam](https://www.mozilla.org/en-US/lightbeam/)

## Add the tracker/site name to the tracker markdown file

The name should be formatted as:

```markdown
**TrackerName** by Corporation (domain.tld)
```

If the tracker name is the same as the corporation name *(e.g. Adlucent by Adlucent)* then just keep the tracker name, and don’t include the corporation name.
*When you edit a tracker markdown file for the first time, the domain.tld will already be in the title.*
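
The naming rule can be expressed as a small helper. This is a sketch under assumptions: `tracker_title` is a hypothetical function, and the exact shape of the title when the corporation is omitted (`**Name** (domain.tld)`) is inferred from the rule above, not confirmed by the Better tooling.

```python
def tracker_title(name, corporation, domain):
    """Format a tracker title, dropping the corporation when it repeats the name."""
    if name.strip().lower() == corporation.strip().lower():
        # Assumed form when tracker and corporation share a name.
        return f"**{name}** ({domain})"
    return f"**{name}** by {corporation} ({domain})"


print(tracker_title("TrackerName", "Corporation", "domain.tld"))
print(tracker_title("Adlucent", "Adlucent", "domain.tld"))
```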

## Add the site description

Add a concise one-line description of what the tracker, or the tracker owner, does.

*Usually the tracker sites have vague marketing spiel to describe themselves. Often a clearer description can be found in their privacy policy. If you can’t find a concise description in their own words, try to find their entry on [Wikipedia](https://wikipedia.org), Bloomberg or Crunchbase.*

Other useful tools:

  * [Wikipedia](https://wikipedia.org)

## Include references in Notes

* Whether it’s the domain whois, or where you found the site description, include a link back to every source in the Notes section.
* Include a link to the tracker/corporation Privacy Policy (if it exists!)
* If you end up looking through the source file to find more information, you can include relevant code snippets in markdown.

*You can use sub-lists in Notes by using indented lists in markdown.*
*[See the Demandbase tracker for a varied use of Notes](https://better.fyi/trackers/company-target.com/#notes)*

# Handling duplicate trackers

Many trackers use multiple domains for the same tracker, or group of trackers. In this case, we don’t want duplicate entries that fall out of sync.

1. The first tracker found and investigated is the canonical tracker.

2. Any further trackers with the same name/owner should link to the canonical tracker in place of the description. *Example from [addthisedge.com tracker](https://better.fyi/trackers/addthisedge.com/):*

	```markdown
	> See [addthis.com](/trackers/addthis.com/).
	```

3. The Ethical Design Violations are still necessary, as the type of violation might vary between the domains.

4. The Block Rule is still necessary, as it blocks this specific domain.

5. The only note necessary is the source for the domain origin. Any other notes can be added to the canonical tracker.

# Content authoring guidelines

* Be brief: do not quote the whole privacy policy; pick out interesting bits.

* You can editorialise (with restraint). Sometimes you just have to laugh at the ridiculousness of some of the trackers that we’re covering. It also helps, when trudging through the cesspit of surveillance capitalism, to retain our humour. And it makes the pages more interesting to read (we don’t want to create a dry database). Please only add editorial comments for something unusually important or to highlight egregious abuses. A good rule of thumb would be: “would this make a good slide in a presentation to illustrate the problem with this particular thing or practice?” Editorial comments should be brief, marked with ‘– Ed.’ and limited to at most one per tracker.

* Use images (sparingly). Not every humdrum tracker page needs images. However, if you are making an editorial comment and you feel that a visual aid is important in highlighting the point, please feel free to use images. Images and screenshots should be 1,160px wide (to display well at their 580pt width on high-resolution screens). Please resize and compress images properly. On Mac, [ImageOptim](https://imageoptim.com/mac) is a great application for compressing PNGs and [PhotoBulk](https://itunes.apple.com/us/app/photobulk-watermark-resize/id537211143?mt=12) is a convenient app for converting between formats.