Submission from @Swapy11 #5

Closed
wants to merge 6 commits into from
119 changes: 15 additions & 104 deletions README.md
@@ -1,115 +1,26 @@
# PactFlow Python Coding Test

Welcome to this Python coding test. This test is designed to assess your ability
to write clean, well-structured, and maintainable code. You will be tasked with
adding some functionality to this codebase.

We will be looking for the following aspects:

1. The readability and clarity of your code; including aspects such as:
   - Naming conventions
   - Code structure
   - Comments
   - Documentation
2. The correctness of your code; including aspects such as:
   - Handling of edge cases
   - Error handling
   - Testing
3. The maintainability of your code; including aspects such as:
   - Modularity
   - Extensibility
   - Reusability
4. Your familiarity with standard development tools and practices; including
   aspects such as:
   - Version control
   - Creating and using virtual environments
   - Documenting PRs and commits

Please fork this repository and submit your solution as a pull request. Your
solution should pass the existing CI checks, and you should ensure that your
code is tested. This project uses the [pytest](https://docs.pytest.org/en/stable/) testing framework.

## Development

This project uses [Hatch](https://hatch.pypa.io) for managing the development
environment. The code is split across three packages:

- `pypacter`: The core logic
- `pypacter-api`: API wrapper
- `pypacter-cli`: CLI to interact with the API

The structure of the project is as follows:

```text
pypacter/
├── pypacter-api/ <== API wrapper
│ ├── src/pypacter_api/
│ ├── tests/
│ ├── pyproject.toml
│ └── README.md
├── pypacter-cli/ <== CLI to interact with API
│ ├── src/pypacter_cli/
│ ├── tests/
│ ├── pyproject.toml
│ └── README.md
├── notebooks/ <== Jupyter notebooks (if any)
├── src/
│ └── pypacter/ <== Core logic
├── tests/
├── mkdocs.yml
├── pyproject.toml
└── README.md
```

## Tasks

The following tasks purposefully leave out some specificity to allow you to
demonstrate your problem-solving skills, and give you the opportunity to make
decisions about the implementation.

Each task should only take about 30 minutes to complete, and you should also allow 30 minutes to familiarize yourself with the codebase. If you find yourself spending more time on a task, submit what you have and document in the PR what you would have done if you had more time.
# Steps taken to complete the assignment

### Task 1

#### Summary

Add a new function to the core package to:

- Ingest a snippet of code
- Output the most likely programming language

Ideally, this function should make use of a large language model (LLM) to detect the language, but you can use any method you prefer.

#### Motivation

Clients will be submitting code snippets to the API, and in order to improve the
customer experience, we want to automatically detect the programming language
instead of requiring the client to specify it.
1. Added the utility function `read_input_content_from_file_or_string` to `pypacter-cli/src/pypacter_cli/util.py`.
   - It returns the file's content if the argument is the name of an existing file; otherwise it returns the input itself.
2. Added the `detect_language` function to `src/pypacter/language_detector.py`.
   - It calls `GPT_4.invoke(prompt)` to invoke the GPT-4 model, passing it the prompt used to identify the programming language.
   - It uses the `GPT_4` instance already initialised in `models.py`.
3. Added unit tests in `tests/test_language_detector.py`.


### Task 2

#### Summary

Add a new API endpoint for the language detection function.
Added the `/detect-language` API endpoint to `pypacter-api/src/pypacter_api/base.py`.
- It extracts the input from `request_body`.
- It calls the `detect_language` function from the core package to get the output.
- It converts the output to JSON and returns it in a `JSONResponse` object.
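
The three steps listed for the endpoint (extract, detect, serialise) can be sketched without any web framework. This is only an illustration: `handle_detect_language`, the status codes it picks, and the `fake_detector` stand-in are assumptions made for the example, not code from this PR.

```python
from typing import Callable, Dict, Tuple


def handle_detect_language(
    request_body: Dict[str, str],
    detector: Callable[[str], str],
) -> Tuple[int, Dict[str, str]]:
    """Return an (HTTP status, JSON body) pair for a detect-language request."""
    code_snippet = request_body.get("code_snippet", "")
    if not code_snippet.strip():
        # Reject empty input up front instead of spending a model call on it.
        return 422, {"detail": "code_snippet must not be empty"}
    try:
        return 200, {"language": detector(code_snippet)}
    except Exception as exc:  # catch-all, mirroring the endpoint in this PR
        return 500, {"detail": f"Error occurred: {exc}"}


def fake_detector(snippet: str) -> str:
    """Hypothetical stand-in for the core detect_language function."""
    return "JavaScript" if "console.log" in snippet else "Unknown"


status, body = handle_detect_language(
    {"code_snippet": 'console.log("Hello world");'}, fake_detector
)
print(status, body)
```

Keeping the validation in a plain function like this would let it be unit-tested without starting the API, with the route handler reduced to a thin wrapper.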

#### Motivation

Another team is building a feature that requires the language detection
functionality, and instead of duplicating the work, they have asked us to
expose the functionality via a new API endpoint.

### Task 3

#### Summary

Add a new CLI command for the language detection function. The CLI should
accept the code snippet either as a file path, or through standard input.
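
One possible way to accept either a file path or standard input is sketched below. It uses `argparse` from the standard library rather than the project's click setup, purely to stay dependency-free; the command name `detect-language` and the helper names are assumptions for illustration.

```python
import argparse
import io
from pathlib import Path
from typing import Optional, Sequence, TextIO


def read_snippet(path: Optional[str], stdin: TextIO) -> str:
    """Read the snippet from `path`, or from `stdin` when no path (or "-") is given."""
    if path is None or path == "-":
        return stdin.read()
    return Path(path).read_text(encoding="utf-8")


def parse_args(argv: Sequence[str]) -> argparse.Namespace:
    """Parse an optional positional file path."""
    parser = argparse.ArgumentParser(prog="detect-language")  # hypothetical name
    parser.add_argument(
        "path", nargs="?", help="file to read; omit or use '-' for standard input"
    )
    return parser.parse_args(argv)


# Simulate piping a snippet into the command, using an in-memory stream.
args = parse_args([])
snippet = read_snippet(args.path, io.StringIO('print("hi")\n'))
print(snippet)
```

In a real command, `sys.stdin` would be passed in place of the `io.StringIO` object. Taking the stream as a parameter keeps the function easy to test.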

#### Motivation

The CLI is the primary way that developers interact with the API, and we want to
make sure that the new functionality is easily accessible.

### Task 4 (Optional)

Show-case your skills by adding a new feature of your choice to the core package. Ideally, this feature should make use of an LLM.
1. Added the `call_detect_language_api` function to `pypacter-cli/src/pypacter_cli/__init__.py`; it calls the API endpoint to get the output.
2. Added the `process_input` function to `pypacter-cli/src/pypacter_cli/__init__.py` as a CLI command; it takes input from the user, processes it, and prints the output to the CLI.
3. Added the `requirements.txt` file.
22 changes: 22 additions & 0 deletions pypacter-api/src/pypacter_api/base.py
@@ -9,6 +9,7 @@
from fastapi.responses import JSONResponse

from pypacter_api import get_version
from pypacter.language_detector import detect_language

router = APIRouter()

@@ -33,3 +34,24 @@ async def version() -> JSONResponse:
        A JSON response containing the version of the API.
    """
    return JSONResponse(content={"version": get_version()})


@router.post("/detect-language")
async def detect_language_endpoint(request_body: Dict[str, str]) -> JSONResponse:
    """
    Endpoint to detect the programming language of a code snippet.

    The handler is deliberately not named ``detect_language``: that name would
    shadow the imported core function and make the call below invoke the
    handler itself.

    Args:
        request_body (dict): JSON payload containing the "code_snippet".

    Returns:
        JSONResponse: A JSON response with the detected language.

    Raises:
        HTTPException: With status 500 if detection fails.
    """
    try:
        code_snippet = request_body.get("code_snippet", "")
        detected_language = detect_language(code_snippet)
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error occurred: {e}") from e

    return JSONResponse(content={"language": detected_language})
45 changes: 45 additions & 0 deletions pypacter-cli/src/pypacter_cli/__init__.py
@@ -18,6 +18,7 @@
import click
import rich.traceback
import rich_click
import requests
from rich.logging import RichHandler

from pypacter_cli.__version__ import __version__, __version_tuple__
from pypacter_cli.util import read_input_content_from_file_or_string
@@ -33,6 +34,8 @@
    "__copyright__",
    "cli",
    "get_version",
    "process_input",
    "read_input_content_from_file_or_string",
]

__author__ = "Joshua Ellis"
@@ -104,3 +107,45 @@ def get_version() -> str:
        The version string.
    """
    return __version__


def call_detect_language_api(code_snippet: str) -> Dict[str, str]:
    """
    Call the FastAPI endpoint to detect the programming language.

    Args:
        code_snippet (str): The code snippet to analyze.

    Returns:
        dict: A dictionary with the detected language, or with an "error" key
            if the request fails.
    """
    api_url = "http://127.0.0.1:8000/detect-language"  # Replace with your actual API URL

    try:
        # A timeout keeps the CLI from hanging if the API is unreachable.
        response = requests.post(api_url, json={"code_snippet": code_snippet}, timeout=10)
        # Turn HTTP error statuses (e.g. 500) into a RequestException.
        response.raise_for_status()
        return response.json()

    except requests.RequestException as req_exc:
        return {"error": f"Request error: {req_exc}"}


@click.command()
@click.option('--input', 'user_input', prompt='Enter input (filename or string)', help='Input filename or string')
def process_input(user_input: str) -> None:
    """
    Detect the programming language of input given as a filename or a string.

    Args:
        user_input (str): The input provided by the user, which can be a filename or a string.

    Returns:
        None

    Prints:
        The detection result returned by the API.

    """
    content = read_input_content_from_file_or_string(user_input)
    result = call_detect_language_api(content)
    click.echo(f'Programming language is:\n{result}')
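
The API URL in `call_detect_language_api` is hard-coded with a note to replace it. A minimal sketch of one alternative, reading the base URL from the environment, is shown below. The variable name `PYPACTER_API_URL` and the helper `resolve_endpoint` are hypothetical and do not appear in this PR.

```python
import os
from typing import Mapping

DEFAULT_API_URL = "http://127.0.0.1:8000"  # same host and port as the hard-coded URL


def resolve_endpoint(env: Mapping[str, str] = os.environ) -> str:
    """Build the detect-language URL from PYPACTER_API_URL (hypothetical variable)."""
    # Strip a trailing slash so the joined URL never contains "//".
    base = env.get("PYPACTER_API_URL", DEFAULT_API_URL).rstrip("/")
    return f"{base}/detect-language"


print(resolve_endpoint({}))
print(resolve_endpoint({"PYPACTER_API_URL": "https://api.example.test/"}))
```

Passing the environment as a mapping keeps the function deterministic in tests while still defaulting to `os.environ` in normal use.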
25 changes: 25 additions & 0 deletions pypacter-cli/src/pypacter_cli/util.py
@@ -38,3 +38,28 @@ def wrapper(*args: _P.args, **kwargs: _P.kwargs) -> _T:
        return asyncio.run(f(*args, **kwargs))

    return wrapper


def read_input_content_from_file_or_string(input: Union[str, os.PathLike]) -> str:
    """
    Read the content from a file or use a string input.

    Args:
        input (Union[str, os.PathLike]): Either a filename or a string.

    Returns:
        str: The content read from the file if `input` is the path of an
            existing file, otherwise `input` itself.

    Note:
        A filename that does not exist is not an error: it is returned
        unchanged and treated as the snippet.

    """
    if os.path.isfile(input):
        # Input is a filename
        with open(input, 'r', encoding='utf-8') as file:
            content = file.read()
    else:
        # Input is a string
        content = str(input)
    return content
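
The file-or-string rule has a consequence worth seeing in action: a mistyped path is silently treated as code. The sketch below reimplements the same rule under the hypothetical name `read_file_or_string` and exercises it against a temporary file; it is an illustration, not the helper from this PR.

```python
import os
import tempfile


def read_file_or_string(value: str) -> str:
    """Existing file: return its content. Anything else: return the value itself."""
    if os.path.isfile(value):
        with open(value, "r", encoding="utf-8") as handle:
            return handle.read()
    return value


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "snippet.js")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write('console.log("Hello world");')

    from_file = read_file_or_string(path)          # path exists: file content
    from_text = read_file_or_string("x = 1")       # no such file: returned verbatim
    typo = read_file_or_string(path + ".missing")  # mistyped path: treated as code

print(from_file)
print(from_text)
print(typo.endswith(".missing"))
```

If that ambiguity matters, separate options for a path and for inline text would make the caller's intent explicit.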
35 changes: 35 additions & 0 deletions requirements.txt
@@ -0,0 +1,35 @@
anyio==3.6.2
asgiref==3.4.1
async==0.6.2
attrs==22.2.0
certifi==2024.2.2
charset-normalizer==2.0.12
click==8.0.4
commonmark==0.9.1
contextlib2==21.6.0
contextvars==2.4
dataclasses==0.8
fastapi==0.83.0
h11==0.13.0
idna==3.7
immutables==0.19
importlib-metadata==4.8.3
iniconfig==1.1.1
packaging==21.3
pluggy==1.0.0
py==1.11.0
pydantic==1.9.2
Pygments==2.14.0
pyparsing==3.1.2
pytest==7.0.1
requests==2.27.1
rich==12.6.0
rich-click==1.2.1
sniffio==1.2.0
starlette==0.19.1
tomli==1.2.3
typing==3.7.4.3
typing_extensions==4.1.1
urllib3==1.26.18
uvicorn==0.16.0
zipp==3.6.0
1 change: 1 addition & 0 deletions src/pypacter/__init__.py
@@ -20,4 +20,5 @@
    "__url__",
    "__license__",
    "__copyright__",
    "detect_language",
]
40 changes: 40 additions & 0 deletions src/pypacter/language_detector.py
@@ -0,0 +1,40 @@
from pypacter.models import GPT_4

def detect_language(sample_code: str) -> str:
    """
    Detects the programming language of a given sample_code.

    Args:
        sample_code (str): The code snippet to analyze.

    Returns:
        str: The name of the detected programming language.
    """
    prompt = [
        ("system", "You are a helpful assistant that detects programming languages."),
        ("human", sample_code),
    ]

    # Detect the programming language
    detected_language = GPT_4.invoke(prompt)

    # Extract the model's reply and return it; without the return statement
    # the function would always yield None.
    language = detected_language.content.strip()
    return language


def main():
    """
    Driver code to test the detect_language function.
    """
    snippet1 = 'List<String> things = new ArrayList<>();'
    snippet2 = 'console.log("Hello world");'

    language1 = detect_language(snippet1)
    language2 = detect_language(snippet2)

    print(f"Snippet 1 is written in {language1}")
    print(f"Snippet 2 is written in {language2}")


if __name__ == "__main__":
    main()
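
Because the system prompt does not constrain the reply format, the model may answer with a sentence rather than a bare language name, while the tests compare against exact strings such as "Java". A minimal sketch of a normalisation step follows. The alias table and the name `normalize_language` are hypothetical, and the mapping is deliberately tiny.

```python
from typing import Dict

# Hypothetical alias table; extend as needed.
_ALIASES: Dict[str, str] = {
    "js": "JavaScript",
    "javascript": "JavaScript",
    "java": "Java",
    "py": "Python",
    "python": "Python",
}


def normalize_language(raw_reply: str) -> str:
    """Map a free-form model reply to a canonical name, or "Unknown"."""
    # Keep only the first line, then drop surrounding whitespace, quotes,
    # backticks, and full stops before the lookup.
    lines = raw_reply.strip().splitlines()
    first_line = lines[0] if lines else ""
    cleaned = first_line.strip(" \t\"'`.").lower()
    return _ALIASES.get(cleaned, "Unknown")


print(normalize_language(" javascript.\n"))
print(normalize_language(""))
print(normalize_language("This code is written in Java"))
```

The last call shows the limit of this approach: a full-sentence reply is not parsed and falls through to "Unknown". Asking the model to reply with the language name only would complement it.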
27 changes: 27 additions & 0 deletions tests/test_language_detector.py
@@ -0,0 +1,27 @@
import pytest

from pypacter.language_detector import detect_language


def test_detect_language_valid_code():
    snippet = 'List<String> things = new ArrayList<>();'
    language = detect_language(snippet)
    assert language == "Java"


def test_detect_language_javascript_code():
    snippet = 'console.log("Hello world");'
    language = detect_language(snippet)
    assert language == "JavaScript"


def test_detect_language_empty_code():
    snippet = ''
    language = detect_language(snippet)
    assert language == "Unknown"


def test_detect_language_exception():
    # Assumes detect_language raises for input it cannot classify.
    with pytest.raises(Exception):
        snippet = 'invalid_code'
        detect_language(snippet)


# Add more test cases as needed

if __name__ == "__main__":
    pytest.main()
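
The tests above call the real model, so their results depend on network access and on the exact wording of the reply. One way to make them deterministic is to substitute a mock for the model, sketched here with `unittest.mock`. The `detect_language` variant below takes the model as a parameter, which is an assumption made for the example and differs from the function in this PR.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def detect_language(sample_code: str, model) -> str:
    """Minimal stand-in for the core function, with the model passed in explicitly."""
    prompt = [
        ("system", "You are a helpful assistant that detects programming languages."),
        ("human", sample_code),
    ]
    return model.invoke(prompt).content.strip()


def test_detect_language_uses_model_reply() -> None:
    snippet = "List<String> things = new ArrayList<>();"
    model = MagicMock()
    # The mock reply mimics an object exposing a `.content` attribute.
    model.invoke.return_value = SimpleNamespace(content=" Java \n")

    assert detect_language(snippet, model) == "Java"
    # The snippet must be forwarded to the model as the human message.
    prompt = model.invoke.call_args.args[0]
    assert prompt[-1] == ("human", snippet)


test_detect_language_uses_model_reply()
```

With the module-level `GPT_4` instance used in this PR, a similar effect could be achieved by patching that name in the module under test instead of passing the model as an argument.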