This post discusses how to set up a proxy on your AWS EC2 instance.
First, if you are completely new to EC2 and want to learn how to set up and connect to an instance, I would highly recommend this video. AWS provides a free micro instance for 1 year, so go ahead and try it.
I assume you have now set up your EC2 instance and connected to it using PuTTY.
Know your current public IP:
To find your public IP, you can log in to your AWS console <https://aws.amazon.com/console/> and check it there.
1. We are more interested in how the outside world sees us, so let’s check this from our AWS instance. Type this on your instance:
>>> curl checkip.dyndns.org
This would return something like:
<html><head><title>Current IP Check</title></head><body>Current IP Address: 245.902.34.171</body></html>
When there is no proxy, this address and the one shown in the AWS console should match.
2. At this point, if you type curl www.nseindia.com you will get a “blocked” message in return.
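If you would rather compare the two addresses from a script than by eye, the IP can be pulled out of the HTML that checkip.dyndns.org returns. This is only a small sketch: the function name extract_ip is my own, and the sample address 203.0.113.7 is a documentation-only address, not a real result.

```python
import re

# Matches a dotted-quad IPv4-looking string such as 203.0.113.7.
# It does not validate that each part is in the 0-255 range.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_ip(html):
    """Return the first IPv4-looking string in the response body, or None."""
    match = IPV4_PATTERN.search(html)
    return match.group(0) if match else None


# Sample body in the same shape as the checkip.dyndns.org reply shown above.
sample = (
    "<html><head><title>Current IP Check</title></head>"
    "<body>Current IP Address: 203.0.113.7</body></html>"
)
print(extract_ip(sample))
```

You can run this once before and once after setting the proxy and compare the two values.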
Ref :
1. https://www.shellhacks.com/linux-proxy-server-settings-set-proxy-command-line/
Now our goal is to change this IP.
Setting up proxies:
1. This one is very simple: you just need to substitute your IP, PORT, user and password in the commands below and run them on EC2:
export http_proxy=http://username:password@hostIP:PORT/
export https_proxy=http://username:password@hostIP:PORT/
export no_proxy="localhost,127.0.0.1,*.my.lan"
example:
export http_proxy=http://DFER5:qyR7t@238.60.64.145:7700/
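One thing that can go wrong here is a password containing characters such as @ or :, which clash with the separators of the proxy URL. A small sketch of how the value could be assembled safely is below; build_proxy_url is a hypothetical helper of mine, and the second call uses a made-up password just to show the encoding.

```python
from urllib.parse import quote


def build_proxy_url(user, password, host, port, scheme="http"):
    """Assemble scheme://user:password@host:port/ with encoded credentials."""
    # safe="" makes quote() also encode '@', ':' and '/', which would
    # otherwise be mistaken for the URL's own separators.
    user_part = quote(user, safe="")
    password_part = quote(password, safe="")
    return f"{scheme}://{user_part}:{password_part}@{host}:{port}/"


# Same values as the example above; nothing needs encoding here.
print(build_proxy_url("DFER5", "qyR7t", "238.60.64.145", 7700))

# Made-up password containing '@', which becomes %40.
print(build_proxy_url("DFER5", "p@ss", "238.60.64.145", 7700))
```

The printed string is what you would put after export http_proxy= and export https_proxy=.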
2. Read back these variables:
env | grep -i proxy
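Since we will use Python later in this post, it may also help to see what Python's standard library picks up from these variables. The sketch below uses urllib.request.getproxies_environment(); the wrapper name proxies_seen_by_python is mine, and the variable is set inside the script only so that the example runs on its own.

```python
import os
import urllib.request


def proxies_seen_by_python():
    """Return the proxy mapping urllib derives from *_proxy variables."""
    return urllib.request.getproxies_environment()


# Set here only for illustration; normally the export commands above
# have already put this value into the environment.
os.environ["http_proxy"] = "http://DFER5:qyR7t@238.60.64.145:7700/"

for scheme, proxy in proxies_seen_by_python().items():
    print(scheme, "->", proxy)
```

The keys are the variable names with the _proxy suffix removed, so http_proxy shows up under "http".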
3. Check whether your proxy is working.
Run the same curl command again and look at the returned IP; it should be your proxy IP.
>>> curl checkip.dyndns.org
If you run into any difficulty, read the references listed above.
4. If you wish to check a proxy without setting these variables, you can test it using:
curl -x http://IP:PORT/ --proxy-user USER:PASSWORD -L checkip.dyndns.org
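The same one-off check can be done from Python without touching the environment. This is a rough sketch using only the standard library; make_proxy_opener and fetch_ip_page are names I made up, and the real call is left commented out because it needs a reachable proxy.

```python
import urllib.request


def make_proxy_opener(proxy_url):
    """Build an opener that sends http and https traffic through proxy_url."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(handler)


def fetch_ip_page(proxy_url, timeout=15):
    """Fetch the checkip page through the given proxy and return its HTML."""
    opener = make_proxy_opener(proxy_url)
    with opener.open("http://checkip.dyndns.org", timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Needs a working proxy, so replace the placeholders before uncommenting:
# print(fetch_ip_page("http://USER:PASSWORD@IP:PORT/"))
```

The user and password go inside the proxy URL here, playing the role of --proxy-user in the curl command.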
5. Now let us try to access our blocked website, www.nseindia.com.
Try: curl www.nseindia.com
This time it will not return the earlier block message, but it will hang, and possibly after a long time it will return:
curl: (52) Empty reply from server
6. This is because we need to send proper headers with the curl request. If you try a Python module like nsepy ( pip install nsepy ), this will work.
>> python3
Python 3.8.2 (default, Jul 16 2020, 14:00:26)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nsepy
>>> from datetime import date
>>> print(nsepy.get_history(symbol='SBIN',
... start=date(2015,1,1),
... end=date(2015,1,2)))
Output:
            Symbol  Series  Prev Close    Open   High     Low   Last   Close    VWAP   Volume      Turnover  Trades  Deliverable Volume  %Deliverble
Date
2015-01-01    SBIN      EQ      311.85  312.45  315.0  310.70  314.0  314.00  313.67  6138488  1.925489e+14   58688             1877677       0.3059
2015-01-02    SBIN      EQ      314.00  314.35  318.3  314.35  315.6  315.25  316.80  9935094  3.147389e+14   79553             4221685       0.4249
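To make the "proper headers" point from step 6 a bit more concrete, here is a sketch of how a request with browser-like headers can be prepared with the standard library. The header values are my own guess at something browser-like; they are not taken from nsepy, and I am not claiming this exact set is what the site requires.

```python
import urllib.request

# Assumed, browser-like header values for illustration only.
BROWSER_LIKE_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url, headers=None):
    """Return a Request object carrying the given (or default) headers."""
    return urllib.request.Request(url, headers=headers or BROWSER_LIKE_HEADERS)


request = build_request("https://www.nseindia.com")
# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))

# Sending it would look like this (left commented out, needs network access):
# with urllib.request.urlopen(request, timeout=15) as response:
#     print(response.status)
```

When the request is sent with urlopen, the default opener reads the http_proxy and https_proxy variables set earlier, so no extra proxy code should be needed.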
Note: if you want to use the proxy permanently, it is better to add these variables to your /etc/environment file as well, so that you do not have to set them every time. You can refer to the references listed above for that.
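As a rough idea of what the lines in /etc/environment could look like, the sketch below just prints them so you can review and paste them in yourself. environment_lines is a hypothetical helper, and the simple KEY="value" layout is my assumption; please check it against the reference for your distribution before editing the file.

```python
def environment_lines(proxy_url, no_proxy="localhost,127.0.0.1"):
    """Return KEY="value" lines for the three proxy variables."""
    return [
        f'http_proxy="{proxy_url}"',
        f'https_proxy="{proxy_url}"',
        f'no_proxy="{no_proxy}"',
    ]


# Placeholders as in the commands above; substitute your own values.
print("\n".join(environment_lines("http://username:password@hostIP:PORT/")))
```

The script only prints the lines and does not write to /etc/environment, since that file needs root access and is worth editing by hand.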
If you have any difficulty following this, let me know. I have a few topics in mind for the next post, such as using Postman for effective web scraping.