Commit f9bdb331 authored by Aral Balkan

Cleaned up the readme and documented Blockdown updates.

parent 1a0dc83d
......@@ -201,33 +201,91 @@ Here is an example of a site-specific blocking rule in MSON format:
The Blockdown parser in Better supports all of the [WebKit content blocking rules](https://webkit.org/blog/3476/content-blockers-first-look/). Instead of JSON, however, we enter blocking rules in MSON. All Blockdown rules are combined by Better Builder into a single `blockerList.json` file.
Blockdown differs from plain WebKit content blocker rules in several ways to make authoring easier and to aid in readability:
1. The default load type in Blockdown is third-party.
2. By default, rules are case sensitive.
So, if you do not specify the `load-type` or `url-filter-is-case-sensitive` properties in your rules, they will behave as if you had specified:
```mson
- trigger:
    - url-filter: …
    - load-type: third-party
    - url-filter-is-case-sensitive: true
- action
    - …
```
You may, of course, override these by explicitly specifying those properties in your rules.
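As a rough illustration of the defaulting behaviour described above, here is a minimal Python sketch. It is not Blockdown's actual implementation: the names `BLOCKDOWN_TRIGGER_DEFAULTS` and `apply_trigger_defaults` are hypothetical, and the trigger is modelled as a plain dictionary that mirrors the MSON form shown above rather than the final WebKit JSON.

```python
from typing import Any, Dict

# Hypothetical constant: the two defaults this README describes.
BLOCKDOWN_TRIGGER_DEFAULTS: Dict[str, Any] = {
    "load-type": "third-party",
    "url-filter-is-case-sensitive": True,
}


def apply_trigger_defaults(trigger: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the trigger with any missing defaults filled in.

    Properties set explicitly in the rule take precedence over the defaults.
    """
    merged = dict(BLOCKDOWN_TRIGGER_DEFAULTS)
    merged.update(trigger)
    return merged


# A rule that only sets url-filter picks up both defaults.
implicit = apply_trigger_defaults({"url-filter": "some-domain.ext"})

# A rule that sets load-type explicitly keeps its own value.
explicit = apply_trigger_defaults(
    {"url-filter": "some-domain.ext", "load-type": "first-party"}
)
```

Starting from the defaults and then overlaying the rule means an explicit property always wins, which matches the override behaviour described above.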
## Automatic URL filter compilation
Blockdown automatically compiles simple `url-filter` properties to regular expressions with higher specificity as recommended in the [domain targeting recommendations by WebKit engineer Benjamin Poulain](https://webkit.org/blog/4062/targeting-domains-with-content-blockers/).
This means that you can author your entries in plain text, like this:
```markdown
- url-filter: some-domain.ext
```
And Blockdown will compile them into the following form in `blockerList.json`:
```json
"url-filter": "^[^:]+://+([^:/]+\\.)?some-domain\\.ext[:/]?"
```
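The compilation step above can be sketched in a few lines of Python. This is an illustration only, not the code Better Builder runs: the function name `compile_url_filter` is hypothetical, it escapes only dots (which is all the example above requires), and Python's `re` module stands in for WebKit's regular expression engine purely to show what the pattern matches.

```python
import json
import re


def compile_url_filter(domain: str) -> str:
    """Expand a plain-text domain into the higher-specificity pattern."""
    # Escape dots only, so the result matches the example output above.
    escaped = domain.replace(".", r"\.")
    return r"^[^:]+://+([^:/]+\.)?" + escaped + "[:/]?"


pattern = compile_url_filter("some-domain.ext")

# json.dumps doubles the backslashes, giving the form stored in the JSON file.
print(json.dumps({"url-filter": pattern}))

# The pattern matches the domain and its subdomains, but not a different
# domain that merely ends with the same text.
matches_subdomain = bool(re.search(pattern, "https://cdn.some-domain.ext/t.js"))
matches_lookalike = bool(re.search(pattern, "https://notsome-domain.ext/"))
```

The optional `([^:/]+\.)?` group is what allows subdomains, while the anchored scheme prefix prevents the domain from matching in the middle of an unrelated host name.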
## Further reading on WebKit content blocking
* [Introduction to WebKit Content Blockers](https://webkit.org/blog/3476/content-blockers-first-look/)
* [Targeting Domains with Content Blockers](https://webkit.org/blog/4062/targeting-domains-with-content-blockers/)
* [Official Safari content-blocking rules documentation from Apple](https://developer.apple.com/library/mac/documentation/Extensions/Conceptual/ContentBlockingRules/Introduction/Introduction.html)
# Investigation process
Currently, you need commit rights to the Content repository to use the Better command-line commands. However, you can use Git directly to fork the repository and submit merge requests, and you can [add and edit pages through the online GitLab interface](https://source.ind.ie/better/content) without commit rights.
## Find who owns and runs the tracker
* 1. Start by editing the tracker: `better/edit drafts/trackers/somedoma.in`.
1. **Start by editing the tracker**
```bash
better/edit drafts/trackers/somedoma.in
```
This will create an issue in GitLab (or update an existing issue, if one already exists) and create or check out a branch for you. It will also open your working copy of the tracker page in your system editor and in the browser.
*This will create an issue in GitLab (or update an existing issue, if one already exists) and create or check out a branch for you. It will also open your working copy of the tracker page in your system editor and in the browser.*
2. **Enter the tracker URL into your browser in a private window to see if it loads.**
* 2. First, enter the domain into your browser in a private window to see if it loads.
Make sure you don’t have any VPNs or extensions blocking or making your browser behave differently from the norm. If you have any tracker blockers already enabled, it may make it harder to investigate!
*Make sure you don’t have any VPNs or extensions blocking or making your browser behave differently from the norm. If you have any tracker blockers already enabled, it may make it harder to investigate!*
3. **If it doesn’t load, or if you get a blank page, perform a whois.**
* 3. If it doesn’t load, or if you get a blank page, perform a whois.
We are currently using http://whois.domaintools.com for these so we can link to it as a source when stating ownership information. However, you will sometimes get more information from a direct whois look-up on your machine. In Terminal: `whois somedoma.in`
*We are currently using http://whois.domaintools.com for these so we can link to it as a source when stating ownership information. However, you will sometimes get more information from a direct whois look-up on your machine. In Terminal: `whois somedoma.in`*
4. **Some trackers use a domain proxy or a cloaking service** (e.g., Domains by Proxy) to further hide their origins. In this case:
* 4. Some trackers use a domain proxy or a cloaking service (e.g., Domains by Proxy) to further hide their origins. In this case:
* Open up the source of a site that the tracker originated on in the Web Developer console (Timeline view) of Safari (or in the web inspector of your browser of choice)
* Try to recreate the original call. This might give you more clues about its origin.
* Open up the source of a site that the tracker originated on in the Web Developer console (Timeline view) of Safari (or in the web inspector of your browser of choice)
*(To find which sites a tracker is on, perform a search on the ~/better.fyi/drafts/trackers folder. For example, you can open up the folder in your text editor and do a global search for the tracker name.)*
* Try to recreate the original call. This might give you more clues about its origin.
To find which sites a tracker is on, perform a search on the ~/better.fyi/drafts/trackers folder. For example, you can open up the folder in your text editor and do a global search for the tracker name.
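If you prefer a script to a global search in your text editor, the folder search described above could be sketched as follows. This is an assumption-laden example, not part of the Better tooling: the function name `find_files_mentioning` is hypothetical, and the search is a simple case-insensitive substring match over every file under the folder.

```python
from pathlib import Path
from typing import List


def find_files_mentioning(tracker: str, root: Path) -> List[Path]:
    """Return the files under root whose text contains the tracker name."""
    needle = tracker.lower()
    matches = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            # Skip files that cannot be read rather than aborting the search.
            continue
        if needle in text.lower():
            matches.append(path)
    return matches


# Example call (path taken from the text above):
# root = Path.home() / "better.fyi" / "drafts" / "trackers"
# for path in find_files_mentioning("somedoma.in", root):
#     print(path)
```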
You can also use [Better Inspector](https://source.ind.ie/better/inspector) to search for strings within requests. For example, to find all URLs that contain *google.com*, run:
```bash
./inquiry --local --find=google.com
```
Other useful tools:
* [Mozilla Lightbeam](https://www.mozilla.org/en-US/lightbeam/)
## Add the tracker/site name to the tracker markdown file
......@@ -261,21 +319,26 @@ Other useful tools:
# Handling duplicate trackers
* Loads of trackers have multiple domains for the same tracker, or group of trackers. In this case, we don’t want duplicate entries that don’t stay in sync.
Loads of trackers have multiple domains for the same tracker, or group of trackers. In this case, we don’t want duplicate entries that don’t stay in sync.
* 1. The first tracker found and investigated is the canonical tracker.
* 2. Any further trackers with the same name/owner should link to the canonical tracker in place of the description. *Example from [addthisedge.com tracker](https://better.fyi/trackers/addthisedge.com/):*
1. The first tracker found and investigated is the canonical tracker.
```markdown
> See [addthis.com](/trackers/addthis.com/).
```
2. Any further trackers with the same name/owner should link to the canonical tracker in place of the description. *Example from [addthisedge.com tracker](https://better.fyi/trackers/addthisedge.com/):*
```markdown
> See [addthis.com](/trackers/addthis.com/).
```
* 3. The Ethical Design Violations are still necessary, as the type of violation might vary between the domains.
* 4. The Block Rule is still necessary, as it blocks this specific domain.
* 5. The only Notes entry necessary is the source for the domain origin. Any other notes can be added to the canonical tracker.
3. The Ethical Design Violations are still necessary, as the type of violation might vary between the domains.
4. The Block Rule is still necessary, as it blocks this specific domain.
5. The only Notes entry necessary is the source for the domain origin. Any other notes can be added to the canonical tracker.
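If you are adding many duplicate domains, the reference line used in place of the description can be generated rather than typed by hand. The helper below is a hypothetical sketch (the name `canonical_reference` is not part of the Better tooling) that reproduces the format of the addthisedge.com example above.

```python
def canonical_reference(canonical_domain: str) -> str:
    """Build the blockquote line that points a duplicate at its canonical tracker."""
    return f"> See [{canonical_domain}](/trackers/{canonical_domain}/)."


# Line to use in place of the description on the addthisedge.com page.
line = canonical_reference("addthis.com")
print(line)
```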
# Content authoring guidelines
* Be brief: do not quote the whole privacy policy; pick out interesting bits.
* You can editorialise (with restraint). Sometimes you just have to laugh at the ridiculousness of some of the trackers that we’re covering. It also helps, when trudging through the cesspit of surveillance capitalism, to retain our humour. And it also makes the pages more interesting to read (we don’t want to create a dry database). Please only add editorial comments for something unusually important or to highlight egregious abuses. A good rule of thumb would be: “would this make a good slide in a presentation to illustrate the problem with this particular thing or practice?” Editorial comments should be brief, marked with ‘– Ed.’ and limited to at most one per tracker.
* Use images (sparingly). Not every humdrum tracker page needs images. However, if you are making an editorial comment and you feel that a visual aid is important in highlighting the point, please feel free to use images. Images and screenshots should be 1,160px wide (to display well at their 580pt width on high resolution screens). Please resize and compress images properly. On Mac, [ImageOptim](https://imageoptim.com/mac) is a great application for compressing PNGs and [PhotoBulk](https://itunes.apple.com/us/app/photobulk-watermark-resize/id537211143?mt=12) is a convenient app for converting between formats.