Commit 1eeeaa57 authored by Loïc Barrault

re-added resources

parent 86f58b09
## Makefile
Makefile
## Core latex/pdflatex auxiliary files:
*.aux
*.lof
*.log
*.lot
*.fls
*.out
*.toc
*.fmt
## Intermediate documents:
*.dvi
*-converted-to.*
# these rules might exclude image files for figures etc.
# *.ps
# *.eps
*.pdf
## Bibliography auxiliary files (bibtex/biblatex/biber):
*.bbl
*.bcf
*.blg
*-blx.aux
*-blx.bib
*.brf
*.run.xml
## Build tool auxiliary files:
*.fdb_latexmk
*.synctex
*.synctex.gz
*.synctex.gz(busy)
*.pdfsync
## Auxiliary and intermediate files from other packages:
# algorithms
*.alg
*.loa
# achemso
acs-*.bib
# amsthm
*.thm
# beamer
*.nav
*.snm
*.vrb
# cprotect
*.cpt
# (e)ledmac/(e)ledpar
*.end
*.[1-9]
*.[1-9][0-9]
*.[1-9][0-9][0-9]
*.[1-9]R
*.[1-9][0-9]R
*.[1-9][0-9][0-9]R
*.eledsec[1-9]
*.eledsec[1-9]R
*.eledsec[1-9][0-9]
*.eledsec[1-9][0-9]R
*.eledsec[1-9][0-9][0-9]
*.eledsec[1-9][0-9][0-9]R
# glossaries
*.acn
*.acr
*.glg
*.glo
*.gls
# gnuplottex
*-gnuplottex-*
# hyperref
*.brf
# knitr
*-concordance.tex
*.tikz
*-tikzDictionary
# listings
*.lol
# makeidx
*.idx
*.ilg
*.ind
*.ist
# minitoc
*.maf
*.mtc
*.mtc[0-9]
*.mtc[1-9][0-9]
# minted
_minted*
*.pyg
# morewrites
*.mw
# mylatexformat
*.fmt
# nomencl
*.nlo
# sagetex
*.sagetex.sage
*.sagetex.py
*.sagetex.scmd
# sympy
*.sout
*.sympy
sympy-plots-for-*.tex/
# pdfcomment
*.upa
*.upb
# pythontex
*.pytxcode
pythontex-files-*/
# Texpad
.texpadtmp
# TikZ & PGF
*.dpth
*.md5
*.auxlock
# todonotes
*.tdo
# xindy
*.xdy
# xypic precompiled matrices
*.xyc
# WinEdt
*.bak
*.sav
# endfloat
*.ttt
*.fff
# Latexian
TSWLatexianTemp*
The MIT License (MIT)
Copyright (c) 2016
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# Convolution arithmetic
A technical report on convolution arithmetic in the context of deep learning.
## Convolution animations
<table style="width:100%">
<tr>
<td><img src="gif/no_padding_no_strides.gif"></td>
<td><img src="gif/arbitrary_padding_no_strides.gif"></td>
<td><img src="gif/same_padding_no_strides.gif"></td>
<td><img src="gif/full_padding_no_strides.gif"></td>
</tr>
<tr>
<td>No padding, no strides</td>
<td>Arbitrary padding, no strides</td>
<td>Half padding, no strides</td>
<td>Full padding, no strides</td>
</tr>
<tr>
<td><img src="gif/no_padding_no_strides_transposed.gif"></td>
<td><img src="gif/arbitrary_padding_no_strides_transposed.gif"></td>
<td><img src="gif/same_padding_no_strides_transposed.gif"></td>
<td><img src="gif/full_padding_no_strides_transposed.gif"></td>
</tr>
<tr>
<td>No padding, no strides, transposed</td>
<td>Arbitrary padding, no strides, transposed</td>
<td>Half padding, no strides, transposed</td>
<td>Full padding, no strides, transposed</td>
</tr>
<tr>
<td><img src="gif/no_padding_strides.gif"></td>
<td><img src="gif/padding_strides.gif"></td>
<td><img src="gif/padding_strides_odd.gif"></td>
<td></td>
</tr>
<tr>
<td>No padding, strides</td>
<td>Padding, strides</td>
<td>Padding, strides (odd)</td>
<td></td>
</tr>
<tr>
<td><img src="gif/no_padding_strides_transposed.gif"></td>
<td><img src="gif/padding_strides_transposed.gif"></td>
<td><img src="gif/padding_strides_odd_transposed.gif"></td>
<td></td>
</tr>
<tr>
<td>No padding, strides, transposed</td>
<td>Padding, strides, transposed</td>
<td>Padding, strides, transposed (odd)</td>
<td></td>
</tr>
<tr>
<td><img src="gif/dilation.gif"></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>No padding, no stride, dilation</td>
<td></td>
<td></td>
<td></td>
</tr>
</table>
## Generating the Makefile
From the repository's root directory:
``` bash
$ ./bin/generate_makefile
```
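The script writes the `Makefile` at the repository's root; re-run it whenever the animation configurations in `bin/generate_makefile` are changed.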
## Generating the animations
From the repository's root directory:
``` bash
$ make all_animations
```
The animations will be output to the `gif` directory. Individual animation steps
will be output in PDF format to the `pdf` directory and in PNG format to the
`png` directory.
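The generated rules rely on ImageMagick's `convert` and on `gifsicle`. Each animation also has its own target, so a single one can be rebuilt in isolation, for example:
``` bash
$ make gif/no_padding_no_strides.gif
```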
## Compiling the document
From the repository's root directory:
``` bash
$ make
```
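The generated Makefile also provides a `clean` target that removes the LaTeX auxiliary files:
``` bash
$ make clean
```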
@inproceedings{le1997reading,
title={Reading checks with multilayer graph transformer networks},
author={Le Cun, Yann and Bottou, L{\'e}on and Bengio, Yoshua},
booktitle={Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on},
volume={1},
pages={151--154},
year={1997},
organization={IEEE}
}
@inproceedings{bergstra2010theano,
title={Theano: A CPU and GPU math compiler in Python},
author={Bergstra, James and Breuleux, Olivier and Bastien, Fr{\'e}d{\'e}ric and Lamblin, Pascal and Pascanu, Razvan and Desjardins, Guillaume and Turian, Joseph and Warde-Farley, David and Bengio, Yoshua},
booktitle={Proc. 9th Python in Science Conf},
pages={1--7},
year={2010}
}
@inproceedings{collobert2011torch7,
title={Torch7: A matlab-like environment for machine learning},
author={Collobert, Ronan and Kavukcuoglu, Koray and Farabet, Cl{\'e}ment},
booktitle={BigLearn, NIPS Workshop},
number={EPFL-CONF-192376},
year={2011}
}
@inproceedings{zeiler2011adaptive,
title={Adaptive deconvolutional networks for mid and high level feature learning},
author={Zeiler, Matthew D and Taylor, Graham W and Fergus, Rob},
booktitle={Computer Vision (ICCV), 2011 IEEE International Conference on},
pages={2018--2025},
year={2011},
organization={IEEE}
}
@inproceedings{krizhevsky2012imagenet,
title={Imagenet classification with deep convolutional neural networks},
author={Krizhevsky, Alex and Sutskever, Ilya and Hinton, Geoffrey E},
booktitle={Advances in neural information processing systems},
pages={1097--1105},
year={2012}
}
@article{bastien2012theano,
title={Theano: new features and speed improvements},
author={Bastien, Fr{\'e}d{\'e}ric and Lamblin, Pascal and Pascanu, Razvan and Bergstra, James and Goodfellow, Ian and Bergeron, Arnaud and Bouchard, Nicolas and Warde-Farley, David and Bengio, Yoshua},
journal={arXiv preprint arXiv:1211.5590},
year={2012}
}
@inproceedings{jia2014caffe,
title={Caffe: Convolutional architecture for fast feature embedding},
author={Jia, Yangqing and Shelhamer, Evan and Donahue, Jeff and Karayev, Sergey and Long, Jonathan and Girshick, Ross and Guadarrama, Sergio and Darrell, Trevor},
booktitle={Proceedings of the ACM International Conference on Multimedia},
pages={675--678},
year={2014},
organization={ACM}
}
@incollection{zeiler2014visualizing,
title={Visualizing and understanding convolutional networks},
author={Zeiler, Matthew D and Fergus, Rob},
booktitle={Computer vision--ECCV 2014},
pages={818--833},
year={2014},
publisher={Springer}
}
@article{abaditensorflow,
title={TensorFlow: Large-scale machine learning on heterogeneous systems},
author={Abadi, Mart{\'\i}n and Agarwal, Ashish and Barham, Paul and Brevdo, Eugene and Chen, Zhifeng and Citro, Craig and Corrado, Greg S and Davis, Andy and Dean, Jeffrey and Devin, Matthieu and others},
journal={Software available from tensorflow.org},
year={2015}
}
@inproceedings{long2015fully,
title={Fully convolutional networks for semantic segmentation},
author={Long, Jonathan and Shelhamer, Evan and Darrell, Trevor},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={3431--3440},
year={2015}
}
@article{radford2015unsupervised,
title={Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks},
author={Radford, Alec and Metz, Luke and Chintala, Soumith},
journal={arXiv preprint arXiv:1511.06434},
year={2015}
}
@unpublished{Goodfellow-et-al-2016-Book,
title={Deep Learning},
author={Goodfellow, Ian and Bengio, Yoshua and Courville, Aaron},
note={Book in preparation for MIT Press},
url={http://goodfeli.github.io/dlbook/},
year={2016}
}
@article{im2016generating,
title={Generating images with recurrent adversarial networks},
author={Im, Daniel Jiwoong and Kim, Chris Dongjoo and Jiang, Hui and Memisevic, Roland},
journal={arXiv preprint arXiv:1602.05110},
year={2016}
}
@article{visin15,
author = {Francesco Visin and
Kyle Kastner and
Aaron C. Courville and
Yoshua Bengio and
Matteo Matteucci and
KyungHyun Cho},
title = {ReSeg: {A} Recurrent Neural Network for Object Segmentation},
year = {2015},
url = {http://arxiv.org/abs/1511.07053},
}
@inproceedings {boureau-cvpr-10,
title = "Learning Mid-Level Features for Recognition",
author = "Boureau, {Y-Lan} and Bach, Francis and LeCun, Yann and Ponce, Jean",
booktitle = "Proc. International Conference on Computer Vision and Pattern Recognition (CVPR'10)",
publisher = "IEEE",
year = "2010"
}
@inproceedings {boureau-icml-10,
title = "A theoretical analysis of feature pooling in vision algorithms",
author = "Boureau, {Y-Lan} and Ponce, Jean and LeCun, Yann",
booktitle = "Proc. International Conference on Machine learning (ICML'10)",
year = "2010"
}
@inproceedings {boureau-iccv-11,
title = "Ask the locals: multi-way local pooling for image recognition",
author = "Boureau, {Y-Lan} and {Le Roux}, Nicolas and Bach, Francis and Ponce, Jean and LeCun, Yann",
booktitle = "Proc. International Conference on Computer Vision (ICCV'11)",
publisher = "IEEE",
year = "2011"
}
@InProceedings{ICML2011Saxe_551,
author = {Andrew Saxe and Pang Wei Koh and Zhenghao Chen and Maneesh Bhand and Bipin Suresh and Andrew Ng},
title = {On Random Weights and Unsupervised Feature Learning },
booktitle = {Proceedings of the 28th International Conference on Machine Learning (ICML-11)},
series = {ICML '11},
year = {2011},
editor = {Lise Getoor and Tobias Scheffer},
location = {Bellevue, Washington, USA},
isbn = {978-1-4503-0619-5},
month = {June},
publisher = {ACM},
address = {New York, NY, USA},
pages= {1089--1096},
}
#!/usr/bin/env python
from six import iteritems
from six.moves import range
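
# This script generates the Makefile that builds every animation frame
# (PDF -> PNG -> GIF) and compiles the report.

# Prerequisites shared by every frame of a given figure type.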
arithmetic_files = ('bin/produce_figure templates/arithmetic_figure.txt '
'templates/unit.txt')
numerical_files = 'bin/produce_figure templates/numerical_figure.txt'
animations = (
('no_padding_no_strides',
('arithmetic', 4, 2, 0, 3, 1, 1, 'convolution', False)),
('no_padding_no_strides_transposed',
('arithmetic', 4, 2, 0, 3, 1, 1, 'convolution', True)),
('arbitrary_padding_no_strides',
('arithmetic', 5, 6, 2, 4, 1, 1, 'convolution', False)),
('arbitrary_padding_no_strides_transposed',
('arithmetic', 5, 6, 2, 4, 1, 1, 'convolution', True)),
('same_padding_no_strides',
('arithmetic', 5, 5, 1, 3, 1, 1, 'convolution', False)),
('same_padding_no_strides_transposed',
('arithmetic', 5, 5, 1, 3, 1, 1, 'convolution', True)),
('full_padding_no_strides',
('arithmetic', 5, 7, 2, 3, 1, 1, 'convolution', False)),
('full_padding_no_strides_transposed',
('arithmetic', 5, 7, 2, 3, 1, 1, 'convolution', True)),
('no_padding_strides',
('arithmetic', 5, 2, 0, 3, 2, 1, 'convolution', False)),
('no_padding_strides_transposed',
('arithmetic', 5, 2, 0, 3, 2, 1, 'convolution', True)),
('padding_strides',
('arithmetic', 5, 3, 1, 3, 2, 1, 'convolution', False)),
('padding_strides_transposed',
('arithmetic', 5, 3, 1, 3, 2, 1, 'convolution', True)),
('padding_strides_odd',
('arithmetic', 6, 3, 1, 3, 2, 1, 'convolution', False)),
('padding_strides_odd_transposed',
('arithmetic', 6, 3, 1, 3, 2, 1, 'convolution', True)),
('dilation',
('arithmetic', 7, 3, 0, 3, 1, 2, 'convolution', False)),
('numerical_no_padding_no_strides',
('numerical', 5, 3, 0, 3, 1, 1, 'convolution', False)),
('numerical_padding_strides',
('numerical', 5, 3, 1, 3, 2, 1, 'convolution', False)),
('numerical_average_pooling',
('numerical', 5, 3, 0, 3, 1, 1, 'average', False)),
('numerical_max_pooling',
('numerical', 5, 3, 0, 3, 1, 1, 'max', False)),
)
fields = ('type', 'input-size', 'output-size', 'padding', 'kernel-size',
'stride', 'dilation', 'mode', 'transposed')
animations = dict([(name, dict(zip(fields, config)))
for name, config in animations])
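
# Note: every configuration above satisfies the standard convolution output-size
# relation, output-size = floor((input-size + 2*padding - k_eff) / stride) + 1,
# where k_eff = kernel-size + (kernel-size - 1)*(dilation - 1).
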
def make_header():
    # Declare the aggregate all_animations target; the empty .SECONDARY target
    # keeps make from deleting the intermediate PDF and PNG frames.
    return ('.PHONY : all_animations\nall_animations : {}\n\n'.format(
        ' '.join(['gif/{}.gif'.format(name)
                  for name in animations.keys()])) +
        '.SECONDARY : \n')


def make_report_section():
    # Rules for compiling the report (pdflatex/bibtex), plus a clean target
    # for the LaTeX auxiliary files.
    return ('conv_arithmetic.pdf : export BSTINPUTS=$BSTINPUTS:./natbib\n'
            'conv_arithmetic.pdf : conv_arithmetic.tex\n'
            '\tpdflatex conv_arithmetic\n'
            '\tpdflatex conv_arithmetic\n'
            '\tbibtex conv_arithmetic\n'
            '\tpdflatex conv_arithmetic\n'
            '\tpdflatex conv_arithmetic\n\n'
            '.PHONY : clean\n'
            'clean : \n'
            '\trm -f conv_arithmetic.{aux,bbl,blg,log}\n')


def make_gif_section():
    # One rule per animation: assemble its PNG frames into a GIF with
    # ImageMagick's convert, then optimize it with gifsicle.
    rules = []
    for name, config in iteritems(animations):
        # One frame per position of the sliding kernel.
        if config['transposed']:
            steps = config['input-size'] ** 2
        else:
            steps = config['output-size'] ** 2
        rules.append(
            'gif/{}.gif : '.format(name) +
            ' '.join(['png/{}_{:02d}.png'.format(name, i)
                      for i in range(steps)]) + '\n' +
            '\tconvert -delay 100 -loop 0 -layers Optimize +map -dispose previous $^ $@\n' +
            '\tgifsicle --batch -O3 $@\n')
    return '\n'.join(rules)


def make_png_section():
    # Pattern rule: rasterize each per-step PDF into a PNG frame.
    return ('png/%.png : pdf/%.pdf\n'
            '\tconvert -density 600 $< -flatten -resize 25% $@\n')


def make_pdf_section():
    # One rule per animation frame: each PDF is produced by calling
    # bin/produce_figure with the parameters of its configuration.
    rules = []
    for name, config in iteritems(animations):
        if config['transposed']:
            steps = config['input-size'] ** 2
        else:
            steps = config['output-size'] ** 2
        if config['type'] == 'arithmetic':
            dependencies = arithmetic_files
        else:
            dependencies = numerical_files
        subrules = []
        for i in range(steps):
            subrules.append(
                'pdf/{}_{:02d}.pdf : {}\n'.format(name, i, dependencies) +
                '\t./bin/produce_figure ' +
                '{} {} {} '.format(config['type'], name, i) +
                '--input-size={} '.format(config['input-size']) +
                '--output-size={} '.format(config['output-size']) +
                '--padding={} '.format(config['padding']) +
                '--kernel-size={} '.format(config['kernel-size']) +
                '--stride={} '.format(config['stride']) +
                ('--dilation={} '.format(config['dilation'])
                 if config['dilation'] != 1 else '') +
                ('--mode={} '.format(config['mode'])
                 if config['type'] == 'numerical' else '') +
                ('--transposed\n' if config['transposed'] else '\n'))
        rules.append('\n'.join(subrules))
    return '\n'.join(rules)


def main():
    # Write all sections to the Makefile; the report rules come first, so a
    # bare `make` builds conv_arithmetic.pdf.
    with open('Makefile', 'w') as makefile:
        makefile.write('\n'.join([make_report_section(), make_header(),
                                  make_gif_section(), make_png_section(),
                                  make_pdf_section()]))


if __name__ == "__main__":
    main()
#!/usr/bin/env python
import argparse
import itertools
import os
import subprocess
from glob import glob
import numpy
import six
numpy.random.seed(1234)
def make_numerical_tex_string(step, input_size, output_size, padding,
kernel_size, stride, dilation, mode):
"""Creates a LaTeX string for a numerical convolution animation.