Past Evaluations

November 16-17th, 2017

So back in the day Bruno had made a large CSV file of open-source Flask repositories, 547 unique ones to be exact, 66 had problems downloading (maybe removed between now and then) so I cloned 481 unique open-source repos. [1]

I decided, to stay focused during the hackathon, and complete my goal in time, that I would only look for open-redirects. Open-redirects are when you can send somebody a link like https://chase.com?next=http://evil.com or https://chase.com?next=%68%74%74%70%73%3a%2f%2f%77%77%77%2e%65%76%69%6c%2e%63%6f%6d and evil.com has a phishing page on it. The reason I chose open-redirects as my one bug class is because in my experience, there are less secondary nodes involved. e.g.

@app.route('/login/authorized')
@facebook.authorized_handler
def facebook_authorized(resp):
    next_url = request.args.get('next') or url_for('new_paste')
    if resp is None:
        flash('You denied the login')
        return redirect(next_url)
    # ...

By ‘secondary nodes’, I mean the arg typically is retrieved and then placed in redirect(), it isn’t passed into another function, or used in an assignment somewhere else. This makes them somewhat easier to find statically, as there are less things that can go wrong.

In order to evaluate the tool, I needed to know which repositories actually had vulnerabilities in them, so I put my junk hacking skills to work and made 2 regular expressions. ".*redirect\([a-zA-Z0-9_]+\).*" and ".*redirect\(request.*\).*". The first one detects any redirect call with a variable as the only argument, and the second detects any redirect call with request.anything as the first argument. These aren’t perfect, but have a good chance of detecting 90 something percent of possibly vulnerable calls. (The CSV file somehow had a controller file associated with each repo, and each regex was run on every line.)

This narrowed it down to 20 repositories, around ~4.1 percent. 17 repos matched the first regex, and 3 matched the second. [2]

The great: 4 true positives were reported, although 3 of them are in @twitter.authorized_handler or @facebook.authorized_handler controllers (used in OAuth.) The only notable one is noted first, as it is in code written by the creator of Flask.

The good: 7 didn’t have any vulnerabilities, and we didn’t report any.

The bad: 4 had no real vulnerabilities, but had one or two false positives.

  • billyfung/flask_shortener 2 false positives (Lazy mistake of making .get a source instead of request.args.get etc., so redis.get was used as a source.)
  • ZAGJAB/Flask_OAuth2 1 false positive (Customization is needed for open-redirects, to eliminate all false-positives, because if something tainted is used in string formatting, it typically needs to be at the very beginning of the string to be a vulnerability.)
  • amehta/Flaskly 1 false positive (GitHub issue incoming.)
  • mskog/cheapskate 1 false positive (GitHub issue incoming.)

The ugly: 2 had one false-negative.

The fatal: 3 crashed PyT, but by sheer luck, didn’t have any open-redirect vulnerabilities in them, I looked. 2 or 3 of them were caused by a PR I merged last weekend.

In closing, these results seem okay to me, because once the GitHub issues are made for these issues and fixed we’ll be in a great place. This evaluation certainly didn’t find all the bugs, but probably most of them. The next time I evaluate the tool, I think I’ll look for command injection vulnerabilities or SSRF. I think the only takeaway is that for evaluating a tool like this you should use regexes to narrow down what you spend your time analyzing. I’ll make a PR in the PyT repo to show what I customized to just find open-redirects, but it was mostly trimming down the sink list to just redirect and removing form from the source list.

Around or before May 2016

During the writing of the original thesis PyT was run on 6 or 7 open-source projects and no vulnerabilities were found. I think the sample size was too small.

[1]There are some commits in the CSV file that removed a few repos that caused issues before, but those commits were made long ago, just wanted to note the stats aren’t 100% impartial.
[2]This isn’t exactly true, if something matched the 2nd regex I removed ".*redirect\(request\.url\).*" and ".*redirect\(request\.referrer\).*".