I have the following code, which I am using to log in to a website programmatically. However, instead of returning the logged-in page's HTML (with the user data), it returns the HTML for the login page. I have gone through it several times but can't find what's going wrong.

import java.io.IOException;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Connection.Method;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class LauncherClass {

    static String username = "----username here------"; // blocked out here for obvious reasons
    static String password = "----password here------";
    static String loginUrl = "https://parents.mtsd.k12.nj.us/genesis/parents/j_security_check";
    static String userDataUrl = "https://parents.mtsd.k12.nj.us/genesis/parents?module=gradebook";

    public static void main(String[] args) throws IOException {
        LauncherClass launcher = new LauncherClass();
        launcher.Login(loginUrl, username, password);
    }

    public void Login(String url, String username, String password) throws IOException {

        // POST the credentials straight to the login URL
        Connection.Response res = Jsoup
                .connect(url)
                .data("j_username", username, "j_password", password)
                .followRedirects(true)
                .ignoreHttpErrors(true)
                .method(Method.POST)
                .userAgent("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.4 Safari/537.36")
                .timeout(500)
                .execute();

        Map<String, String> cookies = res.cookies();

        // Reuse the cookies from the login response to fetch the user data page
        Document loggedIn = Jsoup.connect(userDataUrl)
                .cookies(cookies)
                .get();

        System.out.print(loggedIn);
    }
}

[NOTE] The login form does have a line:

 <input type="submit" class="saveButton" value="Login">

but this does not have a "name" attribute, so I did not post it.

Any answers/comments are appreciated!

[UPDATE2] For the login page, browser displays the following...

 ---General
    Remote Address:107.0.42.212:443
    Request URL:https://parents.mtsd.k12.nj.us/genesis/j_security_check
    Request Method:POST
    Status Code:302 Found
----Response Headers
    Content-Length:0
    Date:Sun, 26 Jul 2015 20:06:15 GMT
    Location:https://parents.mtsd.k12.nj.us/genesis/parents?gohome=true
    Server:Apache-Coyote/1.1
----Request Headers
    Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Encoding:gzip, deflate
    Accept-Language:en-US,en;q=0.8
    Cache-Control:max-age=0
    Connection:keep-alive
    Content-Length:51
    Content-Type:application/x-www-form-urlencoded
    Cookie:JSESSIONID=33C445158EB6CCAFFF77D2873FD66BC0; lastvisit=458D80553DC34ADD8DB232B5A8FC99CA
    Host:parents.mtsd.k12.nj.us
    HTTPS:1
    Origin:https://parents.mtsd.k12.nj.us
    Referer:https://parents.mtsd.k12.nj.us/genesis/parents?gohome=true
    User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.4 Safari/537.36
----Form Data
    j_username: ---username here---
    j_password: ---password here---
Dave Newton
mlz7
  • Take a look at this http://stackoverflow.com/questions/31549799/using-jsoup-to-login-to-coned-website/31570494#31570494 – Alkis Kalogeris Jul 26 '15 at 19:56
  • Moreover try setting the userAgent – Alkis Kalogeris Jul 26 '15 at 19:57
  • @alkis I took your advice but still no luck... – mlz7 Jul 26 '15 at 20:07
  • Check the request params in your browser, make sure it isn't using any hidden parameters and/or JavaScript, etc. – Dave Newton Jul 26 '15 at 20:29
  • @DaveNewton yup already did, no effect – mlz7 Jul 26 '15 at 20:32
  • It'll be tough to help without any further info. I'd suggest posting the request body made from the browser. – Dave Newton Jul 26 '15 at 20:37
  • @DaveNewton does that help? or should I post the data for the login page as well? – mlz7 Jul 26 '15 at 20:44
  • I meant the post to login, with your credentials redacted, of course. – Dave Newton Jul 26 '15 at 20:56
  • @DaveNewton alright there sorry about that – mlz7 Jul 26 '15 at 22:30
  • Anything missing from your JSoup request/response? Does JSoup do sessions automatically or do you have to configure it? – Dave Newton Jul 26 '15 at 22:35
  • @DaveNewton JSoup does sessions automatically and as far as I can tell, my request/response should be fine – mlz7 Jul 27 '15 at 01:05
  • Again, without more info it'll be almost impossible to help--we can't see what's happening after your Android post, with your account info, what else might be going on, etc. Your best bet is to continue to analyze differences in browser vs. JSoup behavior, examine the requests and responses closely, blah blah blah. – Dave Newton Jul 27 '15 at 01:49
  • @Dave Newton what other info do you need? I've been analyzing the network vs my code and I see no discrepancies – mlz7 Jul 27 '15 at 01:52
  • Well, I'm guessing there's a difference, otherwise it'd work. I'm not sure what else you could post here except the request and response from your Android client, changing all Android stuff to be precisely like the browser to help figure out what's happening, etc. – Dave Newton Jul 27 '15 at 02:00
  • @Dave Newton well, my code's here, so – mlz7 Jul 27 '15 at 02:03
  • The code isn't the only thing that's going on: the request, the response are important to help understand what's happening when you're interacting with an external system. This is one of those things that it's difficult to debug without sitting in front of the machine and interacting with the system and having access to the same information you have available. – Dave Newton Jul 27 '15 at 02:09
  • @Dave Newton how would I get the response my code yields? The output is just the login form's HTML, and my request should be providing all necessary info – mlz7 Jul 27 '15 at 02:12
  • Try setting the referrer – Alkis Kalogeris Jul 27 '15 at 03:54
  • @alkis I did and it made no difference – mlz7 Jul 27 '15 at 04:37

1 Answer


You have to log in to the site in two stages.
STAGE 1 - Send a GET request to https://parents.mtsd.k12.nj.us/genesis/parents?gohome=true to obtain the session cookies.
STAGE 2 - Send a POST request with your username and password, and attach the cookies you got in stage 1.
The code for that is -

Connection.Response res = null;
Document doc = null;

try {   // Stage 1: GET request to obtain the session cookies
    res = Jsoup.connect("https://parents.mtsd.k12.nj.us/genesis/parents?gohome=true")
//          .userAgent(YourUserAgent)
//          .header("Accept", WhateverTheSiteSends)
//          .timeout(Utilities.timeout)
            .method(Method.GET)
            .execute();
} catch (Exception ex) {
    // Do some exception handling here
}
try {   // Stage 2: POST the credentials together with the stage 1 cookies
    doc = Jsoup.connect("https://parents.mtsd.k12.nj.us/genesis/parents/j_security_check")
//          .userAgent(YourUserAgent)
//          .referrer(Referer)
//          .header("Content-Type", ...)
            .cookies(res.cookies())
            .data("j_username", username)
            .data("j_password", password)
            .post();
} catch (Exception ex) {
    // Do some exception handling here
}
// Now you can use doc!

You may have to add headers such as the user agent, referrer, content type and so on to both requests. At the end of the second request, doc should hold the HTML of the page shown after login.

The reason you cannot log in to the site is that you are sending the POST request without the session cookies, so the server treats it as an invalid request.
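If a further page is needed after login (such as the gradebook URL from the question), the cookies from stage 1 and any cookies set by the login response probably both need to travel with that request. Below is a minimal sketch of one way to combine them in plain Java. `CookieMerge` and `merge` are hypothetical names, not part of Jsoup, and the cookie values are placeholders; the sketch only assumes that cookies arrive as a `Map<String, String>`, which is what `res.cookies()` returns in the code above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Jsoup): combines the cookie maps
// collected from successive responses into one map for the next request.
class CookieMerge {

    // Returns a new map holding every cookie from `earlier`, with any
    // same-named cookie in `later` overriding it, so the value the
    // server sent most recently wins. Neither input map is modified.
    static Map<String, String> merge(Map<String, String> earlier,
                                     Map<String, String> later) {
        Map<String, String> merged = new LinkedHashMap<>(earlier);
        merged.putAll(later);
        return merged;
    }

    public static void main(String[] args) {
        // Placeholder values standing in for the cookies of the
        // stage 1 GET response and of the stage 2 login response.
        Map<String, String> fromGet = Map.of("JSESSIONID", "placeholder-session");
        Map<String, String> fromLogin = Map.of("lastvisit", "placeholder-visit");

        System.out.println(merge(fromGet, fromLogin));
    }
}
```

To get at the login response's cookies with Jsoup, stage 2 would have to end with `.method(Method.POST).execute()` rather than `.post()`, as in the question's code; the merged map can then be passed to `.cookies(...)` on the follow-up GET. Whether the site actually sets extra cookies at login is something to confirm in the browser's developer tools.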

TDG
  • and then another get request to the user data page? – mlz7 Jul 27 '15 at 17:50
  • @Mark After the second request, the `doc` contains the `HTML` code of the page that you see **after** the login. If you need another page, then you'll have to check how you can navigate there - probably you'll get another cookie from the server (for successful logon). I recommend using the browser's developer tools (`F12`) to analyze the communication between the browser (which you imitate with your program) and the server. – TDG Jul 27 '15 at 18:10
  • thanks, this helped a lot and everything is working now. I'm just curious: why do you need to use the logged-in page's cookie to actually log in? Wouldn't it be the other way around? – mlz7 Jul 27 '15 at 20:52
  • Cookies need to be obtained after login, not before! The session is created after login, so you don't need any cookies to log in, only for the further requests that gather data after login – ceph3us Jul 31 '15 at 01:49