How to change the Windows wallpaper using Python 3
The following code can be used to change the Windows wallpaper:

```python
import ctypes

# `image` should hold the absolute path to the wallpaper file
ctypes.windll.user32.SystemParametersInfoW(20, 0, image, 0)  # 20 = SPI_SETDESKWALLPAPER
```
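If you also want the change to persist across sessions, the last argument can be SPIF_UPDATEINIFILE | SPIF_SENDWININICHANGE (value 3), which writes the new wallpaper to the user profile and broadcasts the change. A minimal sketch, with the image path as a placeholder:

```python
import ctypes

SPI_SETDESKWALLPAPER = 20
SPIF_UPDATEINIFILE = 0x01
SPIF_SENDWININICHANGE = 0x02

image = r'C:\path\to\wallpaper.jpg'  # placeholder: absolute path to an existing image
ctypes.windll.user32.SystemParametersInfoW(
    SPI_SETDESKWALLPAPER,
    0,
    image,
    SPIF_UPDATEINIFILE | SPIF_SENDWININICHANGE,
)
```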
```python
import json, requests

def search(keyword):
    keyword = keyword.replace(' ', '+')
    # replace api_id_here with your own Wolfram Alpha AppID
    data = requests.get('https://api.wolframalpha.com/v2/query?input=' + keyword + '&format=plaintext&output=json&appid=api_id_here')
    readable = data.json()
    results = readable['queryresult']
    if results['success'] == True:
        for pod in results['pods']:
            if pod['title'] == 'Result':
                try:
                    value = pod['subpods'][0]['plaintext']
                    return value
                except Exception as e:
                    value = ''
                    raise e
    else:
        print('Something went wrong. Below is full data')
        print(readable)
```
```python
response.xpath("//*[contains(text(), 'txt goes here')]").getall()
```
Oftentimes you might want to manipulate every outgoing request. You don't have to modify every point in your scraper where you make requests; you can do it in your downloader middleware instead. Open middlewares.py in your project and put code like the following in the process_request method (returning the replaced request is what makes Scrapy actually use the new URL; a fuller sketch of the whole method is shown below):

```python
original_url = request.url
new_url = 'modified_url_here'
return request.replace(url=new_url)
```
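For context, this is roughly how the whole method could look in middlewares.py. It is a minimal sketch, not the project's actual middleware: the class name follows the default Scrapy project template, the http-to-https rewrite is just a placeholder rule, and the middleware still has to be enabled under DOWNLOADER_MIDDLEWARES in settings.py:

```python
# middlewares.py
class MyProjectDownloaderMiddleware:
    # hypothetical downloader middleware; only process_request is shown

    def process_request(self, request, spider):
        original_url = request.url
        new_url = original_url.replace('http://', 'https://')  # placeholder rewrite rule
        if new_url != original_url:
            # returning a Request tells Scrapy to schedule the modified request
            # instead of the original one
            return request.replace(url=new_url)
        return None  # leave all other requests untouched
```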
In a web crawling project of mine I had to read proxies from a txt file, with each proxy on its own line. I used the following code to read them into a list, then shuffled them to make sure their order is random:

```python
import random

# `directory` is the folder that contains the proxy list
proxies = [line.rstrip('\n') for line in open(directory + '/proxies_v2.txt')]
random.shuffle(proxies)
```
```python
import requests

proxy = 'proxy:goes:here'
proxyDict = {"http": proxy, "https": proxy}
response = requests.get('https://www.url.com', proxies=proxyDict)
```

You can send POST requests using a similar format, and headers can be added to the request with the headers={} argument.
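For example, a POST through the same proxy might look like this (the URL, payload and header values are placeholders):

```python
headers = {'User-Agent': 'Mozilla/5.0'}  # placeholder header
payload = {'key': 'value'}               # placeholder form data
response = requests.post(
    'https://www.url.com',
    data=payload,
    headers=headers,
    proxies=proxyDict,
)
```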
You can add one or more statuses in settings.py. Scrapy will process requests normally, and when one of these statuses is encountered it will retry that request. You can modify RETRY_HTTP_CODES and add any number of statuses there. You can also control how many times to retry with RETRY_TIMES.
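A settings.py block for this could look like the following; the status codes and retry count are illustrative values, not defaults you have to use:

```python
# settings.py
RETRY_ENABLED = True
RETRY_HTTP_CODES = [500, 502, 503, 504, 408, 429]  # statuses that should trigger a retry
RETRY_TIMES = 5                                     # retry each failing request up to 5 times
```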
The following snippet filters out history entries whose URL or title matches any of a list of keywords, then deletes them from history, saving you lots of time that you'd otherwise spend filtering and deleting manually. Google Chrome must be closed before running this script.

```python
import sqlite3

def cleanChrome():
    # Chrome's history is a SQLite database; Chrome must be closed so the file is not locked
    con = sqlite3.connect('C:\\Users\\{username}\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\History')
    cursor = con.cursor()
    cursor.execute("select id, url, title from urls")
    urls = cursor.fetchall()

    # keywords that you'd like to detect and delete;
    # the script looks for them in both page titles and URLs
    keywords = []

    total = len(urls)
    done = 0
    deleted = 0
    pendingIds = []

    for url in urls:
        print(f'Processing {done} / {total} urls [deleted: {deleted}]')
        uid = url[0]
        link = url[1].lower()
        title = url[2].lower()
        for keyword in keywords:
            if keyword in link or keyword in title:
                print(f'{keyword} matched, deleting..')
                pendingIds.append((uid,))
                deleted += 1
        done += 1

    query = 'DELETE FROM urls WHERE id=?'
    cursor.executemany(query, pendingIds)
    con.commit()
```
```python
import csv

with open(path, mode='r') as csvFile:
    # DictReader consumes the header row itself, so every `row` here is a data row
    reader = csv.DictReader(csvFile)
    lineCount = 0
    for row in reader:
        # do stuff here; each row is a dict keyed by column name
        lineCount += 1
```
A Scrapy spider can close unexpectedly for many reasons. If you'd like to notify yourself or run any other logic whenever a spider closes (expectedly or unexpectedly), create a function with any name, e.g. crawlFinished(), then call self.crawlFinished() at the bottom of the spider's closed() method. Your function will then be executed every time the crawler exits; see the sketch below.
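A minimal sketch of what that can look like; the spider name and the notification logic are placeholders, but closed(reason) is the hook Scrapy calls when the spider finishes:

```python
import scrapy

class MySpider(scrapy.Spider):
    name = 'my_spider'  # placeholder spider

    def closed(self, reason):
        # Scrapy calls this once the spider closes, whatever the reason
        self.crawlFinished(reason)

    def crawlFinished(self, reason):
        # put your notification / cleanup logic here
        self.logger.info(f'Crawl finished: {reason}')
```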
```python
import webbrowser

# register Chrome by pointing webbrowser at its executable, then open a URL with it
webbrowser.register('chrome', None, webbrowser.BackgroundBrowser('C:/Program Files (x86)/Google/Chrome/Application/chrome.exe'))
webbrowser.get('chrome').open('http://url.com')
```
To debug a response from inside a spider callback, you can drop into the Scrapy shell:

```python
from scrapy.shell import inspect_response

inspect_response(response, self)
```

See the Scrapy documentation on invoking the shell from spiders for more details.
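In practice the call sits inside a callback; a hedged sketch (the spider and URL are made up):

```python
import scrapy
from scrapy.shell import inspect_response

class DebugSpider(scrapy.Spider):
    name = 'debug_spider'                 # hypothetical spider
    start_urls = ['https://example.com']  # placeholder URL

    def parse(self, response):
        # opens an interactive shell with this `response` preloaded;
        # the crawl resumes once you exit the shell
        inspect_response(response, self)
```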
There are two main ways to use proxies in Scrapy: on a per-request basis, or with every outgoing request.

How to use a proxy with a single request:

```python
# inside a spider callback; Request is scrapy.Request
proxy = 'proxy_here'
return Request(url=url, callback=self.parse, meta={"proxy": proxy})
```

How to use a proxy with every request: go to middlewares.py, update the process_request method and add the following code:

```python
proxy = 'proxy_here'
request.meta['proxy'] = proxy
```
You can execute the same code with a conditional if you want multiple endpoints hitting the same function. This is awesome because it reduces code duplication: the same block of code can do multiple things with minor changes. Let's say you want to display a page on the following two URLs: site.com/blog/page and site.com/page.

```python
from flask import request, render_template

@app.route('/blog/<slug>', endpoint='post')
@app.route('/<slug>', endpoint='page')
def post(slug):
    post = getPost(slug)
    if request.endpoint == 'post':
        title = post['title']
    elif request.endpoint == 'page':
        title = 'This is a page'
    return render_template('post.html', post=post, title=title)
```
You can use the url_for function for building URLs. For example, let's say you have an assets folder called static in the main app directory. You can use the following code in template files to include CSS files:

```
{{ url_for('static', filename='css/style.css') }}
```
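The same helper works from Python code as well; a small hedged sketch (the Flask app here is a stand-in for your actual application object) shows the path it generates:

```python
from flask import Flask, url_for

app = Flask(__name__)  # stand-in for your real app

with app.test_request_context():
    # builds the same path the template helper produces
    print(url_for('static', filename='css/style.css'))  # -> /static/css/style.css
```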
```bash
# count files under `directory` grouped by last-modification date (YYYY-MM-DD)
find directory -type f -printf '%TY-%Tm-%Td\n' | sort | uniq -c
```