
Commit c3c7bd8

authored
Merge pull request #30 from vgalin/V2.0.0
V2.0.0
2 parents cc5b39c + c9caab6 commit c3c7bd8

File tree

8 files changed

+357
-183
lines changed

README.md

Lines changed: 82 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -49,10 +49,10 @@ hti = Html2Image()
4949
<summary> Multiple arguments can be passed to the constructor (click to expand):</summary>
5050

5151
- `browser` : Browser that will be used, set by default to `'chrome'` (the only browser supported by HTML2Image at the moment)
52-
- `chrome_path` and `firefox_path` : The path or the command that can be used to find the executable of a specific browser.
52+
- `browser_path` : The path or the command that can be used to find the executable of a specific browser.
5353
- `output_path` : Path to the folder in which screenshots will be saved. Default is the current working directory of your Python program.
5454
- `size` : 2-Tuple representing the size of the screenshots that will be taken. Default value is `(1920, 1080)`.
55-
- `temp_path` : Path that will be used by html2image to put together different resources *loaded* with the `load_str` and `load_file` methods. Default value is `%TEMP%/html2image` on Windows, and `/tmp/html2image` on Linux and MacOS.
55+
- `temp_path` : Path that will be used to put together different resources when screenshotting strings or files. Default value is `%TEMP%/html2image` on Windows, and `/tmp/html2image` on Linux and MacOS.
5656

5757
Example:
5858
```python
@@ -208,6 +208,62 @@ print(paths)
208208
# >>> ['D:\\myFiles\\letters_0.png', 'D:\\myFiles\\letters_1.png', 'D:\\myFiles\\letters_2.png']
209209
```
210210

211+
---
212+
213+
#### Change browser flags
214+
In some cases, you may need to change the *flags* that are used to run the headless mode of a browser.
215+
216+
Flags can be used to:
217+
- Change the default background color of the pages;
218+
- Hide the scrollbar;
219+
- Add delay before taking a screenshot;
220+
- Use Html2Image as root, which requires the `--no-sandbox` flag;
221+
222+
You can find the full list of Chrome / Chromium flags [here](https://peter.sh/experiments/chromium-command-line-switches/).
223+
224+
There are two ways to specify custom flags:
225+
```python
226+
# When instantiating the object
227+
hti = Html2Image(custom_flags=['--my_flag', '--my_other_flag=value'])
228+
229+
# Afterwards
230+
hti.browser.flags = ['--my_flag', '--my_other_flag=value']
231+
```
232+
233+
- **Flags example use-case: adding a delay before taking a screenshot**
234+
235+
With Chrome / Chromium, the screenshot is taken as soon as there are no more "pending network fetches", but you may sometimes want to add a delay before taking it, for example to wait for animations to end.
236+
There is a flag for this purpose, `--virtual-time-budget=VALUE_IN_MILLISECONDS`. You can use it like so:
237+
238+
```python
239+
hti = Html2Image(
240+
custom_flags=['--virtual-time-budget=10000', '--hide-scrollbars']
241+
)
242+
243+
hti.screenshot(url='http://example.org')
244+
```
245+
246+
- **Default flags**
247+
248+
For ease of use, some flags are set by default. However, default flags are not used if you decide to specify `custom_flags` or change the value of `browser.flags`:
249+
250+
```python
251+
# Taking a look at the default flags
252+
>>> hti = Html2Image()
253+
>>> hti.browser.flags
254+
['--default-background-color=0', '--hide-scrollbars']
255+
256+
# Changing the value of browser.flags gets rid of the default flags.
257+
>>> hti.browser.flags = ['--1', '--2']
258+
>>> hti.browser.flags
259+
['--1', '--2']
260+
261+
# Using the custom_flags parameter gets rid of the default flags.
262+
>>> hti = Html2Image(custom_flags=['--a', '--b'])
263+
>>> hti.browser.flags
264+
['--a', '--b']
265+
```
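
Because specifying custom flags replaces the defaults, one way to keep them is to build the full list yourself. The snippet below is only a minimal sketch and is not part of Html2Image: `DEFAULT_FLAGS` copies the default values shown above, and `merge_flags` is a hypothetical helper introduced for illustration.

```python
# Copy of the default flags listed above (an assumption: check
# `Html2Image().browser.flags` on your installed version).
DEFAULT_FLAGS = ['--default-background-color=0', '--hide-scrollbars']


def merge_flags(defaults, extra):
    """Return `defaults` followed by `extra`.

    A flag in `extra` replaces a default flag that has the same name
    (the part before an optional `=value`).
    """
    def name(flag):
        return flag.split('=', 1)[0]

    extra_names = {name(flag) for flag in extra}
    kept = [flag for flag in defaults if name(flag) not in extra_names]
    return kept + list(extra)


flags = merge_flags(DEFAULT_FLAGS, ['--virtual-time-budget=10000', '--no-sandbox'])
print(flags)
# ['--default-background-color=0', '--hide-scrollbars',
#  '--virtual-time-budget=10000', '--no-sandbox']
```

The resulting list can then be passed as `custom_flags` or assigned to `hti.browser.flags`, as described above.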
266+
211267
## Using the CLI
212268
HTML2image comes with a Command Line Interface which you can use to generate screenshots from files and urls on the go.
213269

@@ -234,16 +290,32 @@ You can call it by typing `hti` or `html2image` into a terminal.
234290

235291
## Testing
236292

237-
Only basic testing is available at the moment. To run tests, run PyTest at the root of the project:
238-
```
293+
Only basic testing is available at the moment. To run tests, install the requirements (Pillow) and run PyTest at the root of the project:
294+
``` console
295+
pip install -r requirements-test.txt
239296
python -m pytest
240297
```
241298

299+
300+
## FAQ
301+
302+
- Can I automatically take a full page screenshot?
303+
**Sadly no**, it is not easily possible. Html2Image relies on the headless mode of Chrome/Chromium browsers to take screenshots and there is no way to "ask" for a full page screenshot at the moment. If you know a way to take one (by estimating the page size for example) I would be happy to see it, so please open an issue or a discussion!
304+
305+
- Can I add delay before taking a screenshot?
306+
**Yes**, you can. Please take a look at the `Change browser flags` section of the README.
307+
308+
- Can I speed up the screenshot taking process?
309+
**Yes**, when you are taking a lot of screenshots, you can achieve better performance using parallel processing or multiprocessing. You can find an [example of it here](https://github.com/vgalin/html2image/issues/28#issuecomment-862608053).
310+
311+
- Can I make a cookie modal disappear?
312+
**Yes and no**. **No** because there is no option to do it magically and [extensions are not supported in headless Chrome](https://bugs.chromium.org/p/chromium/issues/detail?id=706008#c5) (the [`I don't care about cookies`](https://www.i-dont-care-about-cookies.eu/) extension would have been useful in this case). **Yes** because you can make any element of a page disappear by retrieving the page's source code, modifying it as you wish, and finally screenshotting the modified source code.
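
As a rough illustration of the "modify the source" approach, the sketch below hides an element by injecting a CSS rule instead of deleting its markup. Everything in it is an assumption made for the example: `hide_elements` is a hypothetical helper, and `#cookie-modal` is a made-up selector that you would replace with the one used by the page you are targeting.

```python
def hide_elements(html, selector):
    """Return `html` with a CSS rule that hides elements matching `selector`.

    The rule is inserted just before `</head>` when that tag exists,
    and appended to the end of the document otherwise.
    """
    rule = f'<style>{selector} {{ display: none !important; }}</style>'
    head_end = html.lower().find('</head>')
    if head_end == -1:
        return html + rule
    return html[:head_end] + rule + html[head_end:]


page = (
    '<html><head><title>Demo</title></head>'
    '<body><div id="cookie-modal">Accept cookies?</div><p>Content</p></body></html>'
)
modified = hide_elements(page, '#cookie-modal')

# The modified source could then be screenshotted, for example:
# hti.screenshot(html_str=modified, save_as='no_modal.png')
```

Modals that are created later by JavaScript may need a different selector or a delay flag, so this should be treated as a starting point and not as a general solution.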
242313
## TODO List
243-
- A nice CLI (Currently in a WIP state)
244-
- A better way to name the CLI's outputed files ?
245-
- Support of other browsers, such as Firefox
246-
- More extensive doc + comments
314+
- A nice CLI (currently in a WIP state).
315+
- Support of other browsers (such as Firefox, once their screenshot feature works).
247316
- PDF generation?
248-
- Testing on push/PR with GitHub Actions
249-
- Use threads or multiprocessing to speed up screenshot taking
317+
- Contributing, issue templates, pull request template, code of conduct.
318+
319+
---
320+
321+
*If you see any typos or notice things that are oddly worded, feel free to create an issue or a pull request.*

html2image/browsers/__init__.py

Whitespace-only changes.

html2image/browsers/browser.py

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
from abc import ABC, abstractmethod
2+
3+
4+
class Browser(ABC):
5+
"""Abstract class representing a web browser."""
6+
7+
def __init__(self, flags):
8+
pass
9+
10+
@property
11+
@abstractmethod
12+
def executable_path(self):
13+
pass
14+
15+
@executable_path.setter
16+
@abstractmethod
17+
def executable_path(self, value):
18+
pass
19+
20+
@abstractmethod
21+
def screenshot(self, *args, **kwargs):
22+
pass

html2image/browsers/chrome.py

Lines changed: 185 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,185 @@
1+
from .browser import Browser
2+
3+
import subprocess
4+
import platform
5+
import os
6+
import shutil
7+
8+
9+
def _find_chrome(user_given_path=None):
10+
""" Finds a Chrome executable.
11+
12+
Search for Chrome at the given path. If no path is given,
13+
try to find Chrome or Chromium-browser on a Windows or Unix system.
14+
15+
Raises
16+
------
17+
- `FileNotFoundError`
18+
+ If a suitable chrome executable could not be found.
19+
20+
Returns
21+
-------
22+
- str
23+
+ Path of the chrome executable on the current machine.
24+
"""
25+
26+
# TODO once other browsers are available:
27+
# Ensure that the given executable is a chrome one.
28+
29+
if user_given_path is not None:
30+
if os.path.isfile(user_given_path):
31+
return user_given_path
32+
else:
33+
raise FileNotFoundError('Could not find chrome in the given path.')
34+
35+
if platform.system() == 'Windows':
36+
prefixes = [
37+
os.getenv('PROGRAMFILES(X86)'),
38+
os.getenv('PROGRAMFILES'),
39+
os.getenv('LOCALAPPDATA'),
40+
]
41+
42+
suffix = "Google\\Chrome\\Application\\chrome.exe"
43+
44+
for prefix in filter(None, prefixes):  # skip environment variables that are not set
45+
path_candidate = os.path.join(prefix, suffix)
46+
if os.path.isfile(path_candidate):
47+
return path_candidate
48+
49+
elif platform.system() == "Linux":
50+
51+
# search google-chrome
52+
version_result = subprocess.check_output(
53+
["google-chrome", "--version"]
54+
)
55+
56+
if 'Google Chrome' in str(version_result):
57+
return "google-chrome"
58+
59+
# else search chromium-browser
60+
61+
# snap seems to be a special case?
62+
# see https://stackoverflow.com/q/63375327/12182226
63+
version_result = subprocess.check_output(
64+
["chromium-browser", "--version"]
65+
)
66+
if 'snap' in str(version_result):
67+
chrome_snap = (
68+
'/snap/chromium/current/usr/lib/chromium-browser/chrome'
69+
)
70+
if os.path.isfile(chrome_snap):
71+
return chrome_snap
72+
else:
73+
which_result = shutil.which('chromium-browser')
74+
if which_result is not None and os.path.isfile(which_result):
75+
return which_result
76+
77+
elif platform.system() == "Darwin":
78+
# MacOS system
79+
chrome_app = (
80+
'/Applications/Google Chrome.app/Contents/MacOS/Google Chrome'
81+
)
82+
version_result = subprocess.check_output(
83+
[chrome_app, "--version"]
84+
)
85+
if "Google Chrome" in str(version_result):
86+
return chrome_app
87+
88+
raise FileNotFoundError(
89+
'Could not find a Chrome executable on this '
90+
'machine, please specify it yourself.'
91+
)
92+
93+
94+
class ChromeHeadless(Browser):
95+
"""
96+
Chrome/Chromium browser wrapper.
97+
98+
Parameters
99+
----------
100+
- `executable_path` : str, optional
101+
+ Path to a chrome executable.
102+
103+
- `flags` : list of str
104+
+ Flags to be used by the headless browser.
105+
+ Default flags are :
106+
- '--default-background-color=0'
107+
- '--hide-scrollbars'
108+
- `print_command` : bool
109+
+ Whether or not to print the command used to take a screenshot.
110+
"""
111+
112+
def __init__(self, executable_path=None, flags=None, print_command=False):
113+
self.executable_path = executable_path
114+
if not flags:
115+
self.flags = [
116+
'--default-background-color=0',
117+
'--hide-scrollbars',
118+
]
119+
else:
120+
self.flags = [flags] if isinstance(flags, str) else flags
121+
122+
self.print_command = print_command
123+
124+
@property
125+
def executable_path(self):
126+
return self._executable_path
127+
128+
@executable_path.setter
129+
def executable_path(self, value):
130+
self._executable_path = _find_chrome(value)
131+
132+
def screenshot(
133+
self,
134+
input,
135+
output_path,
136+
output_file='screenshot.png',
137+
size=(1920, 1080),
138+
):
139+
""" Calls Chrome or Chromium headless to take a screenshot.
140+
141+
Parameters
142+
----------
143+
- `output_file`: str
144+
+ Name as which the screenshot will be saved.
145+
+ File extension (e.g. .png) has to be included.
146+
+ Default is screenshot.png
147+
- `input`: str
148+
+ File or url that will be screenshotted.
149+
+ Cannot be None
150+
- `size`: (int, int), optional
151+
+ Two values representing the window size of the headless
152+
+ browser and by extension, the screenshot size.
153+
+ These two values must be greater than 0.
154+
Raises
155+
------
156+
- `ValueError`
157+
+ If the value of `size` is incorrect.
158+
+ If `input` is empty.
159+
"""
160+
161+
if not input:
162+
raise ValueError('The `input` parameter is empty.')
163+
164+
if size[0] < 1 or size[1] < 1:
165+
raise ValueError(
166+
f'Could not screenshot "{output_file}" '
167+
f'with a size of {size}:\n'
168+
'A valid size consists of two integers greater than 0.'
169+
)
170+
171+
# command used to launch chrome in
172+
# headless mode and take a screenshot
173+
command = [
174+
f'{self.executable_path}',
175+
'--headless',
176+
f'--screenshot={os.path.join(output_path, output_file)}',
177+
f'--window-size={size[0]},{size[1]}',
178+
*self.flags,
179+
f'{input}',
180+
]
181+
182+
if self.print_command:
183+
print(' '.join(command))
184+
185+
subprocess.run(command)

html2image/browsers/firefox.py

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
from .browser import Browser
2+
3+
4+
class FirefoxHeadless(Browser):
5+
6+
def __init__(self):
7+
raise NotImplementedError(
8+
"Could not make screenshot work on Firefox headless yet ...\n"
9+
"See https://bugzilla.mozilla.org/show_bug.cgi?id=1715450"
10+
)
11+
12+
@property
13+
def executable_path(self):
14+
pass
15+
16+
@executable_path.setter
17+
def executable_path(self, value):
18+
pass
19+
20+
def screenshot(self, *args, **kwargs):
21+
pass
