tvsubtitles query
parent e125bbc02b
commit dbbcb31154
@ -0,0 +1,32 @@
PigeonHole
==========

The main purpose of this application is to sort specific types of files into a well-arranged directory.

I used it to move TV shows out of a catch-all folder and into the right show folder. The destination is chosen from the filename, which is cleaned up first to make sorting easier.

How it works
------------

The project is split into several files:
* pigeonhole/pigeonhole.py: the one that should be run :)
* setup.py: not used yet, sorry.
* pigeonhole/config.py: where you should put your configuration.

### config.py ###

The configuration file declares three variables:

1. useless_files_extensions: used to clean a folder when that directory (and its subdirectories) contains only files of these types. Do not put `*` in this filter, I don't know how that behaves yet...
2. shows_extensions: the files that need to be organized. The `process` method of the `PigeonHole` class ignores every other file type. Files are recognized by their extension, not by [magic numbers](http://en.wikipedia.org/wiki/List_of_file_signatures).
3. shows_dict: used for files that have a 'special name' (e.g. a file named with 'tbbt', while the real name found in the destination folder is much, much longer).

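
As an illustration only, a `config.py` along the following lines would fit the description above. The extension values, the choice of lists and a dict, the leading dots, the 'tbbt' expansion and the three helper functions are all assumptions made for this sketch; none of them is taken from the project.

```python
import os

# Hypothetical values: the real config.py may use other extensions or container types.
useless_files_extensions = ['.nfo', '.txt', '.url']
shows_extensions = ['.avi', '.mkv', '.mp4']
shows_dict = {'tbbt': 'the big bang theory'}


def is_show_file(filename):
    """Hypothetical helper: decide from the extension only, not from magic numbers."""
    return os.path.splitext(filename)[1].lower() in shows_extensions


def real_show_name(name):
    """Hypothetical helper: expand a 'special name', otherwise return the name unchanged."""
    return shows_dict.get(name.lower(), name)


def only_useless_files(directory):
    """Hypothetical helper: True if every file under directory (recursively) is useless."""
    for _root, _dirs, files in os.walk(directory):
        for filename in files:
            extension = os.path.splitext(filename)[1].lower()
            if extension not in useless_files_extensions:
                return False
    return True
```

With these assumed values, `is_show_file('episode.mkv')` is true, `real_show_name('tbbt')` gives the long name, and a folder holding only `.nfo` files would count as safe to clean.
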

Unit testing
------------

All tests are located in the `pigeonhole/tests` directory. To run them, use the following command, taken from the Python `unittest` documentation:

    python -m unittest discover

The tests create temporary files and folders (and clean them up afterwards) to check that file handling works as expected.

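
A test in this style might look like the sketch below. It does not reproduce the project's tests: `move_file` is a hypothetical stand-in for the code under test, and the file and folder names are made up. It only shows the pattern of creating a temporary tree in `setUp` and removing it in `tearDown`.

```python
import os
import shutil
import tempfile
import unittest


def move_file(source, destination_dir):
    """Hypothetical stand-in for the code under test: move source into destination_dir."""
    if not os.path.isdir(destination_dir):
        os.makedirs(destination_dir)
    target = os.path.join(destination_dir, os.path.basename(source))
    shutil.move(source, target)
    return target


class MoveFileTest(unittest.TestCase):
    def setUp(self):
        # A fresh temporary folder per test, holding one empty fake episode file.
        self.workdir = tempfile.mkdtemp()
        self.episode = os.path.join(self.workdir, 'castle.s01e01.avi')
        open(self.episode, 'w').close()

    def tearDown(self):
        # Remove everything the test created.
        shutil.rmtree(self.workdir)

    def test_file_is_moved(self):
        show_dir = os.path.join(self.workdir, 'castle')
        target = move_file(self.episode, show_dir)
        self.assertTrue(os.path.isfile(target))
        self.assertFalse(os.path.exists(self.episode))
```

Saved under a name matching `test*.py`, a file like this would be picked up by `python -m unittest discover`.
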
@ -0,0 +1,32 @@
"""
Query sites that offer no web service API by fetching their pages over HTTP
and extracting the results with a regex.
"""

import re
import urllib
import urllib2

from BeautifulSoup import BeautifulSoup

# Links to a show page look like /tvshow-<id>.html
SHOW_LINK = re.compile(r"/tvshow-([A-Za-z0-9]*)\.html$")


def query(showname):
    print "Trying " + showname
    # quote() escapes spaces and any other character that is unsafe in a URL
    url = 'http://www.tvsubtitles.net/search.php?q=' + urllib.quote(showname)
    socket = urllib2.urlopen(url)
    try:
        soup = BeautifulSoup(socket.read())
    finally:
        socket.close()

    # Keep only the tags whose href points to a show page
    results = soup.findAll(href=SHOW_LINK)

    if len(results) == 1:
        print "ouh yeah baby " + showname + " " + str(results[0])
    elif len(results) == 0:
        print "No results found for " + showname
    else:
        print "Here are the possible results for " + showname
        for res in results:
            print "\t" + str(res)


if __name__ == "__main__":
    query('the big bang theory')
    query('being erica')
    query('white collar')
    query('scrubs')
    query('castle')
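
The link-matching step can be tried without network access. The sketch below is not the script above: it swaps `urllib2` and BeautifulSoup for the Python 3 standard library (`urllib.parse.quote` and `html.parser`), and `build_search_url`, `ShowLinkParser`, `find_show_links` and the `SAMPLE_HTML` snippet with its show ids are made up for illustration. Only the search URL and the regular expression come from the script.

```python
import re
from html.parser import HTMLParser
from urllib.parse import quote

# Same pattern as in the script: links of the form /tvshow-<id>.html
SHOW_LINK = re.compile(r"/tvshow-([A-Za-z0-9]*)\.html$")


def build_search_url(showname):
    """Hypothetical helper: percent-encode the show name into the search URL."""
    return 'http://www.tvsubtitles.net/search.php?q=' + quote(showname)


class ShowLinkParser(HTMLParser):
    """Hypothetical stand-in for soup.findAll(href=...): collect matching hrefs."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == 'href' and value and SHOW_LINK.search(value):
                self.links.append(value)


def find_show_links(html):
    parser = ShowLinkParser()
    parser.feed(html)
    return parser.links


# Made-up page fragment; the ids are not real tvsubtitles.net ids.
SAMPLE_HTML = '''
<a href="/tvshow-27.html">Show A</a>
<a href="/subtitle-123.html">not a show page</a>
<a href="/tvshow-154.html">Show B</a>
'''
```

Here `find_show_links(SAMPLE_HTML)` keeps the two `/tvshow-...` links and drops the subtitle link, which corresponds to the "several possible results" branch of `query`.
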